Crude oil from the Deepwater Horizon spill is
J. L. Hyde et al.

Alphaviruses use secondary structural 
elements in their genomic RNA to avoid host 
detection. 
N. Konstantinides and M. Averof

Crustacean limb regeneration relies on 
committed progenitor cells including 
satellite-like muscle precursors. 
A. S. Dias et al.

The formation of body segments in vertebrate 
embryos involves local cell interactions 
independent of cyclic gene expression. 
Oil Is Bad for the Heart


Crude oil, which commonly makes up the largest proportion of spilled oil, is cardiotoxic to fish
embryos. To better understand the processes involved, Brette et al. (p. 772) exposed captive
young tuna to oil samples collected from the Deepwater Horizon spill site to determine the
mode of cardiotoxicity. Crude oil prolonged the action potential of cardiomyocytes and disrupted
the excitation-contraction coupling in these cells, functionally disrupting cellular excitability
and creating the potential for cardiac arrhythmias. Such cardiac impacts may be more broadly
distributed in vertebrates exposed to crude oil.


The In-Laws
Through History 


Admixture, the result of previously distant 
populations meeting and breeding, leaves a 
genetic signal within the descendants’ genomes. 
However, over time the signal decays and can 
be hard to trace. Hellenthal et al. (p. 747) 
describe a method, using a technique called 
chromosome painting, to follow the genetic 
traces of admixture back to the nearest extant 
population. The approach revealed details of 
worldwide human admixture history over the 
past 4000 years. 


Losses and Gains 


In order to better understand the process by 
which de novo genes originate, Zhao et al. (p. 
769, published online 23 January) examined 
testis-based gene expression among Drosophila 
melanogaster strains and identified both
fixed and polymorphic de novo genes. The
results suggest that spontaneous activation of 
previously noncoding DNA may be an important 
factor in generating genetic novelty. 


On the Fast Track 


Membranes based on graphene can 
simultaneously block the passage of very small 
molecules while allowing the rapid permeation 
of water. Joshi et al. (p. 752; see the 
Perspective by Mi) investigated the permeation 
of ions and neutral molecules through a 
graphene oxide (GO) membrane in an aqueous 
solution. Small ions, with hydrated radii smaller 
than 0.45 nanometers, permeated through the 


www.sciencemag.org SCIENCE 


GO membrane several orders of magnitude 
faster than predicted, based on diffusion theory. 
Molecular dynamics simulations revealed
that the GO membrane can attract a high
concentration of small ions into the membrane, 
which may explain the fast ion transport. 


Robot Rules 


In the case of mound-building termites, colonies 
comprising thousands of independently behaving 
insects build intricate structures, orders of 
magnitude larger than themselves, using indirect 
communication methods. In this process, known 
as stigmergy, local cues in the structure itself 
help to direct the workers. Werfel et al. (p.
754; see the Perspective by Korb) wanted to
construct complex predetermined structures 
using autonomous robots. A successful system 
was designed so that for a given final structure, 
the robots followed basic rules or “structpaths” in 
order to complete the task. 


Speeding Up 
Surface Diffraction 


Surface diffraction methods 
can determine the atomic 
structure of the topmost 
layer of a crystal and also 
subsurface structures. 
However, many surface 
diffraction methods either require ultrahigh 
vacuum conditions, which limits the reaction 
conditions that can be studied, or require long 
data acquisition times, which limits temporal 
resolution. Using high x-ray energies, Gustafson 
et al. (p. 758, published online 30 January; see
the Perspective by Nicklin) were able to measure 
the intensities of surface-diffracted beams to 
follow the surface oxidation that accompanies 
the changes in a palladium surface during the 
catalytic oxidation of CO with O2.


Rolling Under New Madrid 


During 1811-1812, the New Madrid Seismic 
Zone experienced a sequence of three large 
intraplate earthquakes and at least one 
comparably sized aftershock. There have been 
no earthquakes of similar magnitudes since 
then. Using a combination of historical data 
dating back to the original large events and
an epidemic-type aftershock sequence model,
Page and Hough (p. 762, published online 23 
January) found that the current low seismicity 
is not part of an aftershock sequence. Instead, 
despite low observable deformation rates, there 
is ongoing accumulation of strain, leaving the 
potential for large earthquakes in the region. 


Keeping Alphaviruses 
Under Wraps 


Viruses mutate to avoid detection, and the host 
responds in kind. For example, 2'-O methylation 
of the 5’ cap of viral RNA allows viruses to escape 
detection by the interferon-stimulated host 
defense protein, IFIT1. Alphaviruses, however, 
lack this modification but are able to remain 
undetected in the presence of IFIT1. How? Using 
a combination of viral mutants and biochemical 
analysis, Hyde et al. (p. 783, published online 
30 January) found that alphaviruses contain 
secondary structural motifs in the 5’ untranslated 
region of their genomic RNA that allow them to 
avoid detection by IFIT1. When these regions 
were rendered nonfunctional, IFIT1 was able to 
keep the virus under control. 


Limb Regeneration 


Flatworms possess pluripotent stem cells that can 
regenerate any cell type in the body, whereas 
vertebrates mobilize committed progenitor cells 
whose fate is predetermined. Investigating limb 
regeneration in a crustacean, Konstantinides 
and Averof (p. 788, published online 2 
January) found that arthropods use committed 
progenitor cells to regenerate missing tissues, 
including satellite-like cells to regenerate muscle. 
The study reveals similarities between arthropod 
and vertebrate muscle regeneration, pointing 
to a common basis for muscle regeneration that 
may date back to the common ancestors of all 
bilaterian animals. 
Additional summaries


Toddler Welcome 


It has been assumed that most, if not all, major 
signals that control vertebrate embryogenesis 
have been identified. Using genomics, Pauli
et al. (p. 746, published online 9 January)
have now identified several new candidate
signals expressed during early zebrafish
development. One of these signals, Toddler, is
a short, conserved, and secreted peptide that
promotes the movement of cells during zebrafish
gastrulation. Toddler signals through G protein-coupled
receptors to drive internalization of
the Apelin receptor, and activation of Apelin
signaling can rescue toddler mutants.


Fine-Tuning Brain 
Gyrations 


A handful of patients who suffer from seizures
and mild intellectual disability have now
led the way to insights about how one piece
of regulatory DNA controls development of
a section of the human cortex. Imaging the brains
of these patients, Bae et al. (p. 764; see the
Perspective by Rash and Rakic) observed
malformations on the surface folds in a brain
region that includes “Broca’s area,” the main
region underlying language. The three
affected families shared a 15-base pair deletion
in the regulatory region of a gene, GPR56,
which encodes a G protein-coupled receptor
required for normal cortical development that is
expressed in cortical progenitor cells.


Developmental Complexity 


Although related, the plants Arabidopsis 
thaliana and Cardamine hirsuta have different 
sorts of leaves—one, a rather plain oval and 
the other, a complicated multipart construction.
Comparing the development of the two leaf 
types, Vlad et al. (p. 780) uncovered a gene that 
regulates developmental growth. The C. hirsuta 
gene encoding the REDUCED COMPLEXITY (RCO) 
homeodomain protein arose through gene 
duplication and neofunctionalization, but was 
lost in the A. thaliana lineage. In C. hirsuta, 
RCO suppresses growth in domains around
the perimeter of the developing leaf, yielding
complex-shaped leaves. A. thaliana, lacking 
RCO, produces simple leaves. When RCO was 
expressed in A. thaliana, the leaves became 
more complex. Thus, the capacity to produce 
complex leaves remains, despite loss of the 
initiator. 


Introducing MARS-Seq 


Immune cells are typically differentiated by 
surface markers; however, this designation is 
somewhat crude and does not allow for fine 
distinctions that might be characterized by their 
RNA transcripts. Jaitin et al. (p. 776) used 
massively parallel single-cell RNA-sequencing 
(MARS-Seq) analysis to explore cellular 
heterogeneity within the immune system
by assembling an automated experimental
platform that enables RNA profiling of cells 
sorted from tissues using flow cytometry. More 
than 1000 cells could be sequenced, and 
unsupervised clustering analysis of the RNA 
profiles revealed distinct cellular groupings 
that corresponded to B cells, macrophages, 
and dendritic cells. This approach provides the 
ability to perform a bottom-up characterization 
of in vivo cell-type landscapes independent of 
cell markers or prior knowledge. 


Cell-Cell Interactions in 
Development 


In vertebrate embryos, the number, size, and
positional identity of mesodermal segments
(somites) located bilaterally along the anterior-posterior
axis are widely believed to be controlled
by a molecular clock of oscillating gene
expression interacting with a traveling wave of
signals to determine how many cells make up a
somite. Dias et al. (p. 791, published online 9
January; see the Perspective by Kondo) reveal
that it is possible to generate somites of normal
size, shape, and identity without either a clock 
or a wavefront. Instead, the findings suggest 
that somite size and shape are regulated by 
local cell-cell interactions. 


Folding When Wet 


Most globular proteins release water as
they fold to form a dry hydrophobic core. In
contrast, Sun et al. (p. 795; see the Perspective
by Sharp) report a high-resolution structure
showing that the antifreeze protein Maxi retains
about 400 water molecules in its core. Maxi
is a dimer in which two helical monomers
each bend in the middle to form a four-helix
bundle. The helices are spaced slightly apart to
accommodate two intersecting polypentagonal
monolayers of water. The pentagons form cages
around inward-pointing side chains to stabilize
the structure. The ordered waters extend to
the protein surface where they are likely to be
involved in ice binding.
Science Advances


THE MISSION OF THE NONPROFIT AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE 
(AAAS), the publisher of Science, is to advance science for the benefit of all humankind. 
Science contributes to that mission by communicating the very best research across the full 
range of scientific fields to an extremely broad international audience. The research enterprise
has grown dramatically in the past few decades in the number of high-quality practitioners
and results, but the capacity for Science to accommodate those works in our journal
has not kept pace. Its editors turn away papers that are potentially important, well written, 
of broad interest, and technically well executed. Although other journals provide publishing 
venues for more papers, many authors still desire to be published in Science, a journal known 
for its selectivity, high standards, rapid publication, and high visibility. 

To help meet this need, as well as expand the current content of Science so as to include 
even more diverse topics in science, engineering, technology, mathematics,
and the social sciences, AAAS will be launching, in early
2015, a digital-only journal, Science Advances. Like Science, this 
new publication is designed to encourage transformative research 
and serve a wide readership. Our view at AAAS is that science is 
becoming more integrated and interdisciplinary, and therefore we 
prefer to provide one additional broad journal rather than a number 
of disciplinary titles, each with more limited scope, that would all 
have to be searched to find the papers of most interest. Also like 
Science, this new journal will aim for rapid publication. To 
contain costs for Science Advances, the journal will publish 
original research and review articles only, although select papers in 
Science Advances may be highlighted in Science through News and 
Commentary coverage. 

To ensure the greatest accessibility for authors and readers, the 
new journal will be open access, with publication funded through author processing charges. 
With this publishing model, the number of papers that can be published is limited only by the 
quality of submissions. With digital-only publication, all papers will be posted as soon as they 
are ready for publication. Although Science has long participated in various global efforts to 
give researchers in the world’s poorest countries free access to peer-reviewed results, Science 
Advances will allow AAAS to serve a much larger community of Internet-connected scientists 
who desire to keep current with the latest scientific results. 

The editorial model for Science Advances will be similar to what is used for many society 
journals. A lead editor will be supported by a large number of associate editors who will be 
eminent active scientists. Administrative work will be handled at Science headquarters to 
efficiently process reminders to reviewers and keep manuscripts moving briskly through to a 
decision. Papers favorably reviewed at Science, Science Translational Medicine, or Science 
Signaling but declined for lack of space can be considered automatically for publication in 
Science Advances. This can occur through a cascading process without further review or 
further effort on the part of the authors. The goal is to speed publication, alleviate the burden 
on the reviewer community, and reduce the risk to authors of having to resubmit elsewhere. 

Science Advances will distinguish itself from Science by its editorial model, the immediate
access to papers for all readers, and the fact that acceptance for publication is limited
only by the quality of the paper. In the coming months, we look forward to recruiting the lead 
editor, associate editors, and inaugural papers that will launch a new resource of high-quality 
research for the scientific community. 

— Marcia McNutt and Alan I. Leshner
10.1126/science.1251654 
IMMUNOLOGY


Sick Space Flies 


A concern regarding manned long-term space missions is that changes
in gravitational force compromise the human immune system,
but the underlying cellular and molecular reasons have not been
clear. Taylor et al. studied innate immunity in Drosophila melanogaster
that traveled aboard Space Shuttle Discovery in 2006.
Flies reared in space were compared to flies that underwent development
on Earth. Upon the return of the space-reared flies
to Earth, both groups of flies were subjected to bacterial (Escherichia
coli) or fungal (Beauveria bassiana) infections, and their
gene expression profiles were examined. Genes associated with
Toll receptor-mediated immune responses to fungal infection were
activated only in the Earth flies. The expression of specific antimicrobial
peptides also failed in the space flies. Other mechanisms, such
as the Imd signaling pathway response to bacterial infection, were not
affected in space flies. The space flies exhibited increased expression of
heat shock response genes, a subset of stress response genes that are activated
to manage aberrant protein folding. The authors suggest that microgravity
may alter the folding and stability of proteins, which triggers the deployment of heat
shock proteins that, in turn, may interfere with the Toll receptor signaling pathway. — LC

PLOS One 9, e86485 (2014).


ENGINEERING
Growing Graphene Receivers


One use of the extremely high conductivity of
graphene is in radio-frequency (RF) applications.
However, the devices these applications use require
specialized fabrication methods to avoid
damaging the active graphene channel layer.
Conventional fabrication of practical RF devices
starts by placing the channel material on a silicon
substrate and then fabricating other passive
device components on top of it, using deposition
steps that can involve high temperatures that
can degrade the device performance. Han et al.
report on a fabrication scheme for a graphene
RF receiver that first assembles the passive elements
on a silicon substrate. After
metal lines were fabricated,
atomic layer deposition was
used to deposit a thin
alumina gate dielectric
layer, and the active graphene was grown
through a chemical vapor deposition process.
None of the processing steps required temperatures
in excess of 400°C. The final device, which
contains three graphene field-effect transistors (GFETs), four inductors,
two capacitors, and two resistors in a 0.6-mm-square
area, operates at 4.3 gigahertz and
could receive and restore digital text transmitted
with that carrier frequency. — PDS

Nat. Commun. 10.1038/ncomms4086 (2014). 


ATMOSPHERIC SCIENCE 
Early Intervention 


Anthropogenic carbon dioxide emissions from
the use of fossil fuels may be the most important
cause of modern global warming, but it is
important to remember that humans can affect
climate in other ways, such as through anthropogenic
land cover change (ALCC). Agriculture
and industrial activities have modified
more than half of Earth’s natural
biomes, and ALCC has influenced
global climate both through
biogeophysical feedbacks,
such as modification of the
exchange of momentum and
moisture between the land and the
atmosphere and the alteration of radiative
and heat fluxes; and biogeochemical
ones, including emissions of greenhouse gases
and aerosols from biomass burning, deforestation,
and rice cultivation. He et al. investigate
how important ALCC has been in the past, by
using a climate model forced by recently
compiled observational data to assess how 
ALCC affected climate over the preindustrial 
Holocene. They found that ALCC increased 
global temperatures by around 0.73°C in that 
interval, an amount comparable to the ~0.8°C 
warming that has occurred during industrial 
times. So it seems that early anthropogenic 
activity had a significant impact on climate 
thousands of years before the Industrial 
Revolution began, mostly as a result of the 
greenhouse gas emissions caused by activities 
related to farming, such as deforestation and 
rice cultivation. — HJS 

Geophys. Res. Lett. 41, 10.1002/2013GL058085 (2014).


PLANT SCIENCE 
Predicting the Next Generation 


Heterosis, in which the hybrid offspring perform
better than the inbred parental lines, is a valuable
but unpredictable aspect of maize cultivation.
Traits that are useful in agricultural settings
are often the outcome of complex genetic
interactions, with many genes influencing each
other and developmental outcomes in small
ways. As a result, the genes controlling useful
traits are often unknown. Nonetheless, crop
breeders use what information they can find to
generate more productive maize lines. Feher et
al. have now used metabolites, the downstream
output of complex gene suites, to predict heterosis.
Looking at the early development of the
seedling’s primary root (a well-functioning root
gets the plant off to a good start) as their end
point, the authors compared parental metabolite
profiles to those of the hybrid offspring.
A subset of the metabolites was identified as 
predictive of hybrid outcomes not only for that 
same metabolite but also for several other 
metabolites. The most effective predictions of 
hybrid root biomass were achieved by looking 
at only 5, but any 5, of the most predictive 10 
to 20 metabolites. — PJH 

PLOS One 9, e85435 (2014). 


MICROBIOLOGY 


Ancestor Intercourse 


Trypanosomes (notably including the sleeping
sickness parasites) have long been thought to be
primitive protist oddities with strange biochemistries.
Recent evidence from Peacock et al.
shows that, just like the majority of eukaryotes,
trypanosomes have sex. Starting from observations
on the expression of meiosis-specific genes
in trypanosomes within the salivary glands of
the tsetse fly vector, distinctively shaped cells
(putative gametes) were found. Subsequently,
the cells were observed to intertwine flagella,
squirm, and form intimate pairs. Labeling with
different-colored fluorescent proteins revealed
that membrane and cytoplasmic fusion occurred
(although formal proof is still required for
nuclear and kinetoplastid DNA exchange), hence
confirming that even the most ancestral eukaryotes
indulge in sexual reproduction. — CA

Curr. Biol. 24, 181 (2014). 


PHYSICS 


A Semisynthetic Lattice 


Atomic vapors at very low temperatures are useful
for the quantum simulation of solid-state systems,
because their properties can be finely controlled
and tuned. These neutral atoms are not, however,
completely analogous to the charged carriers in
solids; for instance, an external magnetic field
causes electrons to move in circular orbits but
has no such effects on neutral atoms. Celi et al.
propose a simple method for creating a uniform
magnetic flux in a one-dimensional (1D) optical
lattice that, if realized, might be used to observe
exotic phenomena such as Hofstadter-butterfly-like
fractal spectra or the dynamics of topological
edge states. The method is based on synthetically
extending the 1D lattice into the second dimension
of internal atomic states (spin) by coupling
those states using a pair of Raman laser beams
that are directed at an angle with respect to the
optical lattice; the required amount of the Raman
laser light is substantially smaller than in existing
schemes. The resulting band structure supports
edge states in the spin variable whose dynamics
should be observable through spin-sensitive
density measurements. — JS

Phys. Rev. Lett. 112, 043001 (2014). 


POLICY
Cooperating on Climate 


We've come to expect lack of progress at the annual
United Nations climate talks. A key obstacle
to agreement is the wealth inequality among
the countries around the negotiating table. Such
public-goods negotiations, and the exploitation
of common resources, are tricky enough on their
own, but addressing the gap between “haves”
and “have nots” adds another level of difficulty.
Building on laboratory experiments and earlier
theoretical work, Vasconcelos et al. use evolutionary
game theory models to explore how
wealth inequality and risk perception affect such
negotiations, and address another key element,
the homophily of parties; i.e., their tendency
to align with others from the same wealth
level. They found that if parties were willing to
cooperate regardless of wealth levels, then some
inequality among parties could actually lead to
better cooperation, as the rich tend to contribute
more and compensate for lower contributions
from the poor. Contributions from the poor are
still critical, though, and increased homophily,
with limited cooperation across the wealth gap,
can lead to collapse. Obstinately cooperative
behavior, with a few poor countries cooperating
with wealthier countries, can compensate for
broader homophily. In addition to minimizing
homophilic biases, the authors suggest that
negotiations be portioned into smaller groups
focused on local short-term targets for which
uncertainty is relatively limited. — BW

Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.1323479111 (2014).
Seoul 1


New Bird Flu Strain Threatens 
Korean Research 


A dangerous new strain of bird flu in South
Korea has spread nationwide despite efforts
to clamp down on the virus. Authorities
have culled 2.8 million domestic chickens
and ducks since the outbreak began, and the
strain has also killed dozens of Baikal teal
and other migratory birds. As yet, there are
no reports of human infections. Scientists
are puzzling over where the H5N8 strain,
never before seen in a highly pathogenic
form, originated.

Researchers are now scrambling to keep
the virus out of the country’s premier poultry
research center. A wild goose infected
with the virus was found dead on 1 February
just 10 kilometers from the Suwon campus
of the National Institute of Animal Science
(NIAS), near Seoul. The facility houses
more than 13,000 hens and nearly 5000
ducks for research on breed improvement
and animal husbandry. “If the virus infects
the facility, we would cull all of the poultry,”
says NIAS’s Yong-sup Song. That would
put a serious dent in the center’s genetic
resources and set back ongoing research programs.
http://scim.ag/avianflu


Outbreak. Korean health officials remove eggs
from a duck farm suspected of carrying bird flu.


Portland, Oregon 2 


First Fish Ready to Swim Off 
Endangered Species List 


A tiny minnow has bounced back from near-extinction.
The U.S. Fish and Wildlife Service
(FWS) says populations of the Oregon
chub (Oregonichthys crameri) are healthy
enough to remove the 9-cm-long fish from
its list of threatened and endangered wildlife.
Last week’s announcement marks the
first time an endangered fish has recovered
enough to be delisted.

The chub lives in beaver ponds, oxbows, 
and calm streams of the Willamette River 
Valley of western Oregon. After the 1940s, 
populations plummeted from habitat damage 
by logging, pollution, and dams. When the 
fish was listed by FWS in 1993, only nine 
known populations remained. Predation by 
largemouth bass and other non-native fishes 
was the largest threat to the remaining chub. 

Since then, the Oregon Department of
Fish and Wildlife and other groups started
20 new populations of the chub in predator-free
ponds. And dozens of other surviving
populations have been discovered in the 
wild. Changes to dam management have 
lowered the threat to remaining habitat. 
FWS will accept expert and public comment 
on its proposal to delist until 7 April. 
Mexico City 3


Salamander Sightings Prove
Reports of Extinction Premature


The axolotl salamander’s only known home
in the wild, the Xochimilco canals of Mexico
City, has become increasingly polluted, but
recent reports of the amphibian’s extinction
have been (not-so-greatly) exaggerated.

Two weeks after announcing that months 
of searching the canals hadn’t turned up any 
axolotls, scientists in Mexico City have some 
good news: two of the unique salamanders 
were spotted on 4 February. “There’s been
an alarming reduction in population density,”
says Luis Zambrano, a biologist at the
National Autonomous University of Mexico 
in Mexico City who studies the axolotl. “But 
I can guarantee that [the axolotl] is not yet 
extinct” in the wild. 

Axolotls (Ambystoma mexicanum) 
have long intrigued scientists with their 
odd appearance, their impressive ability to 
regenerate limbs, and the fact that they don’t 
undergo metamorphosis like other salamanders,
instead retaining their feathery gills and
other tadpolelike features into adulthood.
Axolotls are popular pets and lab animals, 
but Zambrano says no reintroduction efforts 
with captive populations will be tried until 
scientists are positive the axolotl is “100% 
extinct” in the wild. 


Geneva, Switzerland 4 


CERN to Study Possibility 
Of 100-Kilometer Atom-Smasher 


Scientists at the European particle physics
laboratory will study plans for a pair of circular
particle colliders 80 to 100 kilometers
in circumference, to be built one after the
other in the same tunnel. The plan would
depart from the current vision for global
particle physics, in which the successor to
CERN’s current 27-kilometer-long Large
Hadron Collider (LHC) would be a straight
linear collider that would smash electrons
into positrons. The LHC smashes protons
and in 2012 discovered the Higgs boson.

In the new scheme, physicists would 
build a somewhat lower energy circular 
electron-positron collider and later a proton 
collider able to reach energies seven times 
as high as the LHC. In reusing the tunnel, 
CERN would repeat a strategy that reduced 
costs during the construction of the LHC. 
The 5-year study is “not about digging
holes in the ground now or asking governments
for money,” says CERN spokesman
James Gillies. “It’s just about considering
the technology that would be available in 
20 or 30 years.” http://scim.ag/atomsmasher 


NEWSMAKERS 


Three Q's


Each Olympics, there is a race between athletes
seeking an artificial advantage and the anti-doping
experts trying to catch them. At the Winter Olympics in Sochi,
a new chemical may be in the mix. Mario
Thevis, a forensic chemist at the German
Sport University Cologne, tested a substance
obtained by German journalists during an
undercover investigation of a Russian scientist
selling “full size MGF” for $1000 per
milligram. Thevis confirmed the sample contained
mechano growth factor (MGF), which
can prompt muscle growth and is undetectable
by current testing methods.


Q: What is the substance you found in
the sample?

M.T.: The closest way to describe it is
human IGF-1 isoform 4. The mRNA of
isoform 4 is elevated when mechanical
stress is applied to muscle tissue. We could
deduce that we were dealing with a highly
pure and therefore probably highly dangerous
substance.




Q: What might the side effects be?

M.T.: We don’t know. It could cause any of
the side effects associated with IGF-1, such
as cardiovascular issues. Some of the growth
factors also have cancer-causing effects. We
can’t prove or rule out any of these.


Science LIVE


Join us on Thursday, 20 February, for a live 
chat with experts on assessing the harm 
of drugs for rational drug policy. 
http://scim.ag/science-live 
Random Sample


Charles Darwin Gets Busted 


An empty pedestal has become the inspiration for a collection of zany sculptures of Charles 
Darwin. Students and staff from University College London competed to fill a void left by the 
relocation of an original bust of the eminent naturalist. 

The competition's seven entries—on display in a 7-week-long exhibition titled Darwin (or) 
Bust at the Grant Museum of Zoology in London—are based on 3D scans of the original plaster 
bust, which was moved to a newly constructed building on campus. The new designs come 
with a twist. One entry renders Darwin's face as a collage of pages from his seminal book, The 
Origin of Species; another, a likeness molded from a transparent gel through which ants will be 
enticed to tunnel; and a third has a pensive-looking Darwin crocheted out of yarn. The winning 
entry was announced at the exhibit’s grand opening on 12 February, the 205th anniversary 
of Darwin's birth. We wanted people to “get creative, get technical, [and] get messy” while 
reimagining the great man, says Grant Museum curator Mark Carnall. 




Q: Now that you know this might be in cir- 
culation among athletes, why can’t you test 
for it right away? 

M.T.: Practically we can, but we have to demonstrate
that our test is fit for the purpose.
We have to evaluate whether our detection 
limits are in the range for physiological or 
therapeutic amounts, even though we have 
no idea how much that would be. 

Extended interview at http://scim.ag/_sochi. 


FINDINGS 


Invasive Fire Ants 
Meet Their Match 


For 60 years, imported red fire ants have terrorized
people, livestock, and native ants
as they spread in large numbers across the
southeastern United States, aggressively 
stinging those who got in the way and proving
nearly invincible. But another foreigner,
the invasive tawny crazy ant, can detoxify 
the fire ant’s deadly venom and, for the past 
decade, has been moving into fire ant territories,
Edward LeBrun, an ecologist at the University
of Texas, Austin, and his colleagues
report online this week in Science. In Texas, 
LeBrun had noticed unusual grooming- 
like behavior by 
tawny crazy ants 
that fire ants had 
swiped with their 
stingers laden with 
venom drops. He 
discovered that the 
tawny ants were 
wiping themselves 
down with their own 
venom to counter 
the chemical attack. 
It’s unclear how or 
when these ants took 
up this defense, says 
Michael Kaspari,
an entomologist at
the University of Oklahoma, Norman, but 
it looks like “this 60-year dynasty of the fire 
ants [in the United States] is coming to a 
close, and it’s coming to a close in a fairly 
unusual way.” http://scim.ag/tawnyants 


Detox. A tawny ant 
reaches for antivenom 
on its abdomen. 
GENOMES


Ancient Infant Was Ancestor 
Of Today's Native Americans 


In 1968, when Sarah Anzick was 2 years old, 
a construction worker discovered more than 
100 stone and bone tools on her family’s land 
near Wilsall, Montana. The artifacts were 
blanketed with red ochre, and with them, also 
covered with ochre, was the skull of a young 
child. In the years since, archaeologists 
concluded that the skull was about 12,700 
years old—the oldest known burial in North 
America—and that the tools belonged to the 
Clovis culture, one of the first in the New 
World. Meanwhile, Sarah Anzick grew up, 
became a genome researcher at the National 
Institutes of Health (NIH), and dreamed of 
sequencing the rare bones. 

This week, she is the second author on 
a paper in Nature that reports the complete 
sequence of the Anzick child’s nuclear 
genome. The sequencing effort, led by 
ancient DNA experts Eske Willerslev 
and Morten Rasmussen of the University 
of Copenhagen, comes to a dramatic 
conclusion: The 1- to 2-year-old Clovis child, 
now known to be a boy, is directly ancestral 
to today’s native peoples from Central 
and South America. “Their data are very 
convincing ... that the Clovis Anzick child 
was part of the population that gave rise to 
North, Central, and Southern American 
groups,” says geneticist Connie Mulligan of 
the University of Florida in Gainesville. 

If correct, the findings refute the 
Solutrean hypothesis, which postulates 
that ancient migrants from Western Europe 
founded the Clovis culture (Science, 
16 March 2012, p. 1289). The data also
undermine contentions that today’s Native
Americans descend from later migrants to 
the Americas, rather than from the earlier 
Paleoindians. And that could help tribes that 
want to claim and rebury ancient American 
skeletons such as that of the 9400-year-old 
Kennewick Man from Washington state. 
“This is proof that Kennewick Man was 
Native American,” says archaeologist Dennis 
Jenkins of the University of Oregon, Eugene. 
Sarah Anzick, whose family is in possession 
of the infant, says that it is likely to be 
reburied in May. 

Common ancestor? The Montana infant (X) is closely
related to today’s Native Americans (circles).

Clovis cache. The child’s skull was found in Montana
with a host of Clovis tools.

Researchers have long wanted to
examine the DNA of the first Americans
for clues to their origins. But even after
scientists developed tools to get DNA from
poorly preserved bones, they lacked the full 
cooperation of today’s Native Americans. The 
Anzick child remained available for study in 
part because it was found on private land, so 
the U.S. Native American Graves Protection 
and Repatriation Act (NAGPRA)—which 
gives native peoples the right to claim and 
rebury many human remains—does not 
apply (Science, 8 October 2010, p. 166). 

Willerslev and colleagues extracted 
DNA from bone fragments taken from 
the child’s skull and one of its ribs, then 
sequenced the genome. They compared 
the genome with those of 143 modern non- 
African populations, including 52 Native 
American ones, in a database compiled over 
several decades by geneticist David Reich 
of Harvard Medical School and others. The 
database includes 45 DNA samples from 
Central and South America and seven from 
Canada and the Arctic, but none from the 
lower 48 states, in part because U.S.-based 
Native American groups have historically 
resisted providing DNA samples, and 
because Reich felt that true informed 
consent was lacking for some samples. 

Despite the North American data gap, the 
team was able to determine that the Anzick 
genome was much more closely related to 
Native Americans than to any other group 
worldwide (see map). The child’s DNA more 
closely resembles that of Central and South 
Americans than Native Americans from the 
far north, although the relationship is still 
very close, Willerslev says. Comparing the 
Anzick genome with that of a 24,000-year-old
Siberian boy (Science, 25 October 2013, p. 
409) and a 4000-year-old Paleo-Eskimo from 
Greenland confirms that Native Americans 
originally come from Northeast Asia. 

How to explain the north-south difference? 
The team proposes that an ancestral 
population that lived several thousand years 
before the Clovis period split into two groups, 
one staying north and one going south. Just 
where and when this split happened cannot be 
determined from the genetic data, Willerslev 
and Rasmussen say. The northerners then 
likely mated with peoples who came in later 
from Asia, and so became slightly more 
genetically distant from Anzick. 

The study “is a real technical and analytical
achievement,” says anthropologist Theodore 
Schurr of the University of Pennsylvania, 
who was not a co-author. It “effectively puts 
the Solutrean hypothesis to rest,” he says. But
advocates of that idea take umbrage at such a 
dismissal. “This is a single individual and can 
in no way represent all that was happening,” 
says archaeologist Bruce Bradley of the 
University of Exeter in the United Kingdom. 

Schurr cautions that the lack of U.S.- 
based Native American genomes could 
have biased the analysis of how closely 
related the Anzick boy is to today’s native 
peoples. “The authors might want to be 
more cautious about making such definitive 
statements” about the Clovis culture’s 
ancestral status “without having ... a much 
broader sampling of North American Indian 
populations,” he says. 


Team members say they hope to get more
U.S. data. “We hope the continued dialogue
with local populations and studies like this
will entice ... Native peoples to participate
in genetic studies,” Rasmussen says. Shane
Doyle, a professor of Native American studies
at Montana State University, Bozeman, and a
member of the Crow tribe, shares that hope.
Doyle is coordinating negotiations about
reburying the child with the Anzick family,
the researchers, and members of 11 local
tribal groups, but he sees the value of such
research for today’s Native Americans. “This
is absolutely going to change the game about
how we think about Paleoindians and their
links to modern-day tribes,” Doyle says.

Both Doyle and Anzick (who notes that she
is acting for her family, not NIH) say they are
agonizing over how, and how soon, the child
should be reburied. They worry that reburial
will destroy data that might be retrieved years
from now with better genetic techniques.
Schurr agrees: “This is why scientists are
fighting against NAGPRA repatriations of
Paleoamerican remains, as much can be
learned from these ancient samples.”

But Doyle and Anzick insist that the child
should be reburied out of respect for his
Native American descendants. “The boy has
given us an amazing gift,” Doyle says. “Now
we must repay that by putting him back where
he belongs.” —MICHAEL BALTER


ASTROPHYSICS


India Poised to Join Hunt for Gravitational Waves


NEW DELHI—Every so often in the uni- 
verse, two neutron stars or black holes col- 
lide so violently that space and time them- 
selves shudder. An emerging global network 
of detectors is watching for these ripples in 
spacetime, which are predicted by Albert 
Einstein’s general theory of relativity. Now 
it has a new recruit: India. “India intends to 
host the third detector” in a U.S.-based array
known as the Laser Interferometer Gravi- 
tational-Wave Observatory (LIGO), Prime 
Minister Manmohan Singh announced on 
3 February at the Indian Science Congress in 
Jammu. The network’s expansion should help 
physicists pinpoint sources of the waves— 
assuming they can be detected. 

Indian scientists say the government 
is likely to commit $201 million over 
15 years to its facility, LIGO-India. “LIGO 
will bring some of the best international and 
Indian astrophysicists to work on Indian 
soil,” says Ratan Kumar Sinha, a nuclear 
engineer and chair of India’s Atomic Energy 
Commission. The U.S. National Science 
Foundation (NSF) is jazzed as well. “I’m 
very excited about this because the science 
reward is so good,” says F. Fleming Crim, 
NSF’s assistant director for mathematical 
and physical sciences. However, he cautions, 
the Indian government must still develop a 
management structure for the project and 
commit to a schedule and budget. 

To detect gravitational waves, LIGO aims 
to measure the stretching of space itself. Built 
by NSF for $294 million, LIGO comprises two 
exquisitely sensitive optical interferometers 
located 3000 kilometers apart, in Hanford, 
Washington, and Livingston, Louisiana. Each
uses laser light to continuously compare the 
relative lengths of two 4-kilometer-long arms 
set at right angles, searching for changes as 
small as 1/10,000 the width of a proton. 

LIGO ran from 2002 to 2010 but didn’t 
spot a signal. Neither have two similar 
interferometers in Europe, GEO600 near 
Hannover, Germany, and VIRGO near Pisa, 
Italy. But so far scientists have surveyed only 
our cosmic neighborhood, which is unlikely 
to harbor a source. By 2015, 
however, a $205 million 
upgrade called Advanced 
LIGO will make both U.S. 
detectors 10 times more 
sensitive—able to probe a 
volume 650 million light- 
years in radius that should 
contain at least a few sources. 
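
A rough way to see why a tenfold sensitivity gain matters: the amplitude of a gravitational wave falls off inversely with distance, so, assuming sensitivity translates directly into range, the surveyed volume grows as the cube of the improvement. The sketch below works through that arithmetic. The function names are illustrative, the 650-million-light-year radius is the figure quoted above, and the pre-upgrade radius is an assumption derived from it.

```python
import math


def volume_gain(sensitivity_gain: float) -> float:
    """Growth in surveyed volume, assuming range scales linearly with sensitivity."""
    return sensitivity_gain ** 3


def sphere_volume(radius: float) -> float:
    """Volume of a sphere of the given radius."""
    return 4.0 / 3.0 * math.pi * radius ** 3


ADVANCED_RADIUS_MLY = 650.0  # million light-years, as quoted in the article
# Assumption: the pre-upgrade reach was 10 times smaller.
initial_radius_mly = ADVANCED_RADIUS_MLY / 10.0

print(volume_gain(10.0))
print(sphere_volume(ADVANCED_RADIUS_MLY) / sphere_volume(initial_radius_mly))
```

Under these assumptions a detector that is 10 times more sensitive surveys about 1000 times the volume, which is why a search that found nothing nearby could still expect sources after the upgrade.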

As part of the upgrade, 
LIGO scientists want to extend the global 
network. They are looking for a home for an 
extra set of mirrors and parts that they had 
planned to use in a second interferometer 
at Hanford that would have crosschecked 
the first one. LIGO’s 8-year run indicated 
that crosschecking is not necessary, says 
LIGO Chief Scientist Stanley Whitcomb 
of the California Institute of Technology 
in Pasadena. So in 2010, LIGO proposed 
building a third station in Australia (Science, 
27 August 2010, p. 1003). After the Australian 
government declined that offer in 2011, 
Indian researchers expressed interest. 

LIGO and VIRGO have shared data since 
2007, but an Indian detector would bolster the 
network’s ability to pinpoint sources. If the 
two LIGO detectors and VIRGO all picked 
up a signal, researchers could compare its 
arrival times to locate the source in the sky. 
But such triangulation would work poorly for 
sources lying near the plane defined by three 
detectors. A fourth detector lying outside 
that plane would help locate sources over the 
entire sky to within a few degrees. 
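
The arrival-time comparison can be sketched for a single pair of detectors. Assuming the waves travel at the speed of light and treating the 3000-kilometer Hanford-Livingston separation as a straight-line baseline (a simplification), a measured delay fixes the angle between the baseline and the source direction, which confines the source to a ring on the sky. Each additional detector pair adds another ring, and the intersection of the rings gives the position. The function names are illustrative and are not taken from any LIGO software.

```python
import math

C_KM_PER_S = 299_792.458  # speed of light in km/s


def max_delay_ms(baseline_km: float) -> float:
    """Largest possible arrival-time difference between two detectors, in ms."""
    return baseline_km / C_KM_PER_S * 1e3


def source_angle_deg(delay_ms: float, baseline_km: float) -> float:
    """Angle between the baseline and the source direction implied by a delay."""
    cos_theta = (delay_ms / 1e3) * C_KM_PER_S / baseline_km
    if abs(cos_theta) > 1.0:
        raise ValueError("delay exceeds the light travel time along the baseline")
    return math.degrees(math.acos(cos_theta))


BASELINE_KM = 3000.0  # Hanford to Livingston, as quoted in the article

print(round(max_delay_ms(BASELINE_KM), 2))  # roughly 10 ms
print(round(source_angle_deg(0.0, BASELINE_KM), 1))  # zero delay: source broadside to the baseline
print(round(source_angle_deg(5.0, BASELINE_KM), 1))
```

A single delay leaves the position along the ring undetermined, so two detectors alone cannot localize a source.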

Some Indian researchers question 
whether the hefty investment in LIGO-India 
is the best use of their country’s 
science budget; others worry 
that it will be hard to find a 
suitably quiet location for the 
vibration-sensitive facility. 
Astrophysicist Bala Iyer, 
chair of the governing council 
of IndIGO, a consortium of 
Indian gravitational wave 
researchers, dismisses those 
concerns. “The community is very happy,” 
he says, noting that ongoing survey work has 
identified several possible locations. 

Even before LIGO-India comes on, 
perhaps in 2020, the global detector network 
should receive a similar boost from a different 
Asian site. Tunnels should be completed next 
month for the 3-kilometer-long arms of the 
$156 million Kamioka Gravitational Wave 
Detector (KAGRA), which scientists aim to 
fire up in 2017, says Takaaki Kajita, a physicist 
at the University of Tokyo. Of course, he says, 
all five detectors together would do the best 
job—which is why physicists are hoping that 
Singh’s promise soon becomes reality. 

—PALLAVA BAGLA AND ADRIAN CHO
2014 SCIENCE INDICATORS


New NSF Report Shows Where U.S. Leads and Lags 


It’s easy to lose track of the continued domi- 
nance of the United States on many indicators 
of scientific prowess as national policymakers 
fret about the rise of China and other Asian 
countries. But it shows up clearly in the lat- 
est edition of the National Science Founda- 
tion’s (NSF’s) biennial compendium of global 
trends in science and engineering. 
The rapid U.S. rebound from the 2008
recession is only one of many trends highlighted
in Science and Engineering Indicators 2014, 
explains Ray Bowen of the National Science 
Board, the NSF oversight body that issued last 
week’s report. So is the economic payoff from 
the continued investment in science by China 
and its neighbors. 
The 600-page tome, accompanied by an
even longer appendix of tables, offers a feast 
for policy wonks. The following tidbits from 
that banquet reveal areas where the United 
States remains preeminent, and other areas in 
which it trails its global competitors. The data 
are from 2012 or the most recent year. 

—JEFFREY MERVIS 


Areas Where the U.S. Is Among the Pack

[Charts: S&E Ph.D.s awarded; export of high-tech products (in billions of dollars); research intensity as percent of GDP (selected countries); commercial clean energy investment (in billions of dollars); triadic patents (awarded by all three patent offices, in thousands).]
ENDANGERED SPECIES


Science Behind Plan to Ease Wolf 
Protection Is Flawed, Panel Says 


A controversial proposal to lift U.S. govern- 
ment protections for most gray wolves living 
in the lower 48 states suffered a major blow 
last week, when an independent review panel 
nixed the underlying science. The U.S. Fish 
and Wildlife Service (FWS) had argued that 
the gray wolf, which has rebounded in some 
parts of the West, does not need continued 
protection to recover in the East because it 
never lived there. But the four-member panel 
unanimously dismissed that argument, say- 
ing it “does not currently represent the ‘best 
available science.’ ” 

The verdict is widely seen as a game- 
changer. “The service will have to take 
this into account,” says Steven Courtney, 
an ecologist at the National Center for 
Ecological Analysis and Synthesis (NCEAS) 
at the University of California (UC), Santa 
Barbara. Courtney led the NCEAS review, 
which the agency requested. 

The panel’s 7 February report is the latest 
twist in a messy and emotionally fraught 
saga. Wolf researchers estimate that some 
2 million wolves lived in the continental 
United States 600 years ago (see map). After 
being hunted to near extinction, the gray 
wolf (Canis lupus) was placed on the federal 
endangered species list in 1975. It was later 
reintroduced in the Rocky Mountains over 
vehement objections from ranchers and 
others. The population ultimately recovered 
to some 6000 animals in some western and
upper midwestern states, and in 2011 the 
federal government lifted wolf protections 
in six of those states, all of which now have 
legal hunts. In June 2013, FWS released its 
proposal to totally remove the gray wolf from 
the endangered species list in every state, 
while adding protections for the Mexican 
gray wolf, a subspecies in the Southwest. It
also proposed to recognize
a new—and controversial—
species of wolf, Canis lupus
lycaon, or the eastern wolf,
which some argue is found today
in eastern Canada.

Legal hunt. Hunters in Idaho and five other states
can now kill wolves.

Contraction. Gray wolves once
inhabited a large swath of the
continental United States, but
are now confined to a few areas.

In a 2012 monograph published
in-house without review, four FWS 
scientists drew on genetic and 
other evidence to argue that the gray 
wolf had never inhabited the upper 
Midwest and Northeast; instead, only 
the eastern wolf had occupied that 
territory. That scenario, if true, would 
support delisting the gray wolf because it 
means federal officials don’t have a legal 
obligation to try to restore the species to 
22 eastern states. 

But the NCEAS panel, which included 
specialists on wolf genetics, said the idea of 
an eastern wolf is “not universally accepted 
and ... ‘not settled,’ ” and rejected the idea
that the two species had never mixed in 
the East. There’s no question that the gray 
wolf used to be present in the East, too, 
says panelist Paul Wilson, a conservation 
geneticist at Trent University, Peterborough, 
in Canada, who believes that the eastern wolf 
is a separate species. 

It appears the agency was trying to use 
“some kind of taxonomic sleight of hand” 
to support delisting, says panelist Robert 
Wayne, a conservation geneticist at UC 
Los Angeles. That would set a “dangerous 
precedent,” he adds: It would be the first time 
that the federal government has removed a 
species from the list as a result of a taxonomic 
redefinition, and not a population recovery. 

It’s not yet clear how FWS will respond 
to the setback. Agency Director Dan Ashe 
called the NCEAS report “an important 
step” in a statement, but didn’t tip his hand. 
The agency has reopened public comment 
on the delisting proposal until 27 March; so 
far, more than 1 million people have sent in
responses, the most in the agency’s history. 

In the meantime, wolves continue to be 
heavily hunted in the states where delisting 
has occurred, with at least 1000 killed in the 
current season. Those states are allowed to 
reduce the number of wolves within their 
borders to 100 animals, or 10 packs with 
10 individuals each. But some scientists 
fear that is too few to maintain healthy 
populations over the long term. 

—VIRGINIA MORELL
HISTORY OF SCIENCE


After More Than 50 Years, a Dispute 
Over Down Syndrome Discovery 


It would have been a personal triumph for 
Marthe Gautier, an 88-year-old pediatric 
cardiologist and scientist living in Paris. On 
31 January, during a meeting in Bordeaux, 
Gautier was to receive a medal for her role in 
the discovery of the cause of Down syndrome 
in the late 1950s. In a speech, she planned to 
tell an audience of younger French geneticists 
her story about the discovery—and how she 
felt the credit she deserved went to a male colleague,
Jérôme Lejeune.

But Gautier’s talk was canceled just hours 
in advance, and she received the medal a day 
later in a small, private ceremony. The French 
Federation of Human Genetics (FFGH), 
which organized the meeting, decided to 
scrap the event after two bailiffs showed up 
with a court order granting them permission 
to tape Gautier’s speech. They were sent by the 
Jérôme Lejeune Foundation, which wanted
to have a record of the talk. The foundation, 
which supports research and care for patients 
with genetic intellectual disabilities and 
campaigns against abortion, said it had 
reason to believe Gautier would “tarnish” the 
memory of Lejeune, who died in 1994. 

A brilliant cytogeneticist with a storied 
career, Lejeune has become widely known 
as the scientist who discovered that Down 
syndrome is caused by an extra copy of 
chromosome 21. He received many awards, 
including one from former U.S. President John 
F. Kennedy. But in recent years, Gautier has
claimed that she did most of the experimental
work for the discovery. In the French 
newspaper Le Monde, Alain Bernheim, the 
president of the French Society of Human 
Genetics, last week compared her case to that 
of Rosalind Franklin, whose contribution to 
the discovery of the double helix structure of 
DNA in the early 1950s was long overlooked. 
In an e-mail to Science, 
Gautier referred to an 
interview published on the 
Web for her version of events 
more than half a century ago.
In it, she explained that she 
worked on Down syndrome 
in the pediatric unit led 
by Raymond Turpin at the 
Armand-Trousseau Hospital 
in Paris, which she joined in 
1956 after a year at Harvard 
Medical School in Boston. 
Human cytogenetics was 
just coming of age. In 1956, a Swedish team 
showed that humans have 46 chromosomes in 
every cell, not 48, as was widely believed. In 
the United States, Gautier had learned to grow 
heart cell cultures, so she proposed to set up 
an advanced cell culture lab and study Down 
syndrome. She says she received her first 
patient sample in May 1958; examining slides,
she soon noticed an extra chromosome, but 
she was unable to identify it or take pictures 
with her low-power microscope. In June 1958,
she “naively” accepted an offer from Lejeune,
who Gautier says was studying Down
syndrome using other techniques, to take her
slides and get them photographed.

First author. Jérôme Lejeune, who
passed away in 1994.

Claiming credit. Marthe Gautier’s talk at a recent
genetics meeting in Bordeaux was canceled.

Gautier claims she was “shocked” 
when, after more than 6 months of silence, 
she learned that the discovery was about to 
be published in the journal of the French 
Academy of Sciences, with Lejeune as the 
first author and Turpin the last; Gautier was 
in the middle, her last name misspelled 
as Gauthier. Gautier doesn’t dispute that 
Lejeune identified the 47th chromosome as 
an extra copy of chromosome 21, but she 
maintains that she was the first to notice the 
abnormal count. 

While acknowledging that Gautier played a
role, the Jérôme Lejeune Foundation claims
that Lejeune himself made the discovery. “In 
July 1958, during a study of chromosomes 
of a so-called ‘mongoloid’ child, [Lejeune] 
discovered the existence of an extra 
chromosome on the 21st pair,” according to
the foundation’s website. The foundation has 
denied that Lejeune appropriated Gautier’s 
discovery; in a press statement, it says a letter 
Turpin sent in October 1958 suggests Gautier 
still hadn’t seen the 47 chromosomes. 

Things came to a head at the meeting in 
Bordeaux. After calling off Gautier’s talk and 
the award ceremony, FFGH issued a statement 
saying it would have been “unacceptable” to 
hold the ceremony under the threat of a legal 
suit. But the federation also 
said it “bitterly regretted” the 
cancellation and condemned 
the use of legal power to 
put pressure on a scientific 
meeting. 

Simone Gilgenkrantz, a 
professor emeritus of human 
genetics at the University 
of Lorraine in France and 
a friend of Gautier’s, says 
the presentation, which she 
has seen, was “completely 
innocuous.” Gautier writes 
in an e-mail to Science that she accepted the 
decision and that she felt unprepared to deal 
with what she calls “an aggression.” “To talk 
under the pressure of justice is not tolerable 
for me or anyone else,” she writes. 

Ideology is fueling some of the rancor. 
Lejeune, a staunch Catholic, was horrified 
by the advent of prenatal diagnostics, which 
made it possible to screen fetuses for Down 
syndrome and other abnormalities, and 
abort those afflicted. He set out to find a 
therapy for genetic intellectual disabilities
like Down syndrome, but also campaigned 
tirelessly against abortion—which made 
him a lightning rod among the left wing in 
France. (Lejeune was friends with Pope John 
Paul II and the Vatican is now considering a 
request to beatify him.) In its statement, the 
foundation lashed out at Gautier’s supporters 
for trying to discredit an ideological 
opponent. It said Gautier, at her age, can’t be 
blamed for her “confusion,” but called stories
backing her version of events in Le Monde 
and Libération—both left-wing papers— 
“ideological terrorism.” 

Gilgenkrantz, who convinced Gautier to 
tell her story in 2009, says it should be told 
regardless of the politics involved. To her, it’s 
one more tale of a female scientist wronged 
at a time when French science was still very 
sexist. “This is a story that must be known,” 
she says, “in the name of women.”

But Bernard Dutrillaux, who worked 
in Lejeune’s lab from the mid-1960s until 
the early 1980s, believes that some score- 
settling may be going on. “Lejeune made 
a lot of enemies” among his peers, he says. 
Still, he condemns the foundation’s legal 
maneuvers. Both sides, Dutrillaux says, 
should know better than to fight such “petty 
rear-guard battles.” —ELISABETH PAIN


Laser Fusion Shots Take Step Toward Ignition 


As it approaches its fifth birthday, the
National Ignition Facility (NIF), a troubled
laser fusion lab in California, has finally
produced some results that outsiders can get
enthusiastic about. In a series of experiments
last year, NIF researchers produced yields
of energy 10 times greater than achieved
before and demonstrated the phenomenon
of self-heating that will be crucial if fusion
is to reach its ultimate goal of “ignition”—a
self-sustaining reaction that produces more
energy than it consumes. “This is a very
significant achievement, and
it’s a very good place to start for
going to higher yield,” says Steven
Rose of the Centre for Inertial
Fusion Studies at Imperial
College London.

NIF, at Lawrence Livermore
National Laboratory in California,
aims to release enormous
amounts of energy by fusing
together nuclei of two isotopes of
hydrogen: deuterium and tritium.
It heats the nuclei to enormous
temperatures and pressures so
that they smash together with
enough force to overcome their
mutual repulsion.

NIF has struggled to get
its process to work, however
(Science, 21 September 2012, p. 1444).
Last year, instead of going flat-out for
ignition, researchers there adopted a more
exploratory approach to try to identify the
[...] published this week in Nature and last week
in Physical Review Letters, are the first sign
that this approach is working. “It’s a nice
result,” says Robert McCrory, director of
the Laboratory for Laser Energetics at the
University of Rochester in New York, who
quickly cautions that NIF is still a long
way from ignition. “People expecting a
breakthrough soon will be disappointed.”

To reach the extreme conditions
necessary for fusion, NIF relies on a laser the
size of a football stadium. It produces 192
ultraviolet beams in a pulse lasting just 15
nanoseconds that can deliver 1.9 megajoules
(MJ) of energy, roughly the same as the
kinetic energy of a 2-ton truck traveling
at 160 kilometers per hour. The ultraviolet
beams are converted into x-rays, which then
compress a fuel capsule, a hollow plastic
sphere smaller than a peppercorn, containing
0.17 milligrams of frozen deuterium and
tritium. The aim is to produce a ball of fuel
with a temperature of 50 million kelvin and
100 times the density of lead, conditions that
can spark fusion.

Researchers realized last year that during
this implosion, the plastic capsule was
breaking up and mixing with the fuel, making
it harder to achieve fusion. So they adjusted the
timing of the laser pulse. Traditionally, it ran at
a low power for most of its 20 nanoseconds
to get the implosion moving without heating
up the fuel and then finished with a burst of
high power for the final spark. The rationale
for this “low foot” approach was that the still-cool
fuel would compress to a higher density
at the end. The downside was that the slower
speed allowed the capsule time to break up.
So they decided to try a pulse that started
off with a higher power to implode faster and
ended the pulse sooner, after 15 ns. Although
such a “high foot” pulse wouldn’t produce
such high density at the end, the researchers
hoped it would help control the mixing.
A laser shot carried out on 13 August last
year proved them right, with a huge jump
in energy output. Another two shots, on 27
September and 19 November, did
even better, actually producing
more energy (14.4 and 17.3 kJ)
than was deposited in the fusion
fuel at the start (11 and 9 kJ), the
first time that has been achieved
in a laser fusion experiment. “We
took a step back from what had
been tried before and that gave
us a leap forward,” says NIF team
leader Omar Hurricane.

On the up. A chart of NIF’s fusion shots since mid-2011 shows that a “high-foot”
laser pulse has boosted yields. Dates are given as YYMMDD.
[Chart: yield by shot number, with low-foot and high-foot shots marked.]

Importantly, the team also saw
a self-heating phenomenon that
will be vital for increased yield.
Fusion reactions produce alpha
particles (helium nuclei); when
reactions start in the core of the
fuel, the alphas help by heating
the surrounding cooler fuel up
to reaction temperature. “This is the first
strong indication of that bootstrap process,”
Hurricane says.

However, NIF is far from real gain (more
energy out than the total input) because so
much energy is lost converting the laser beams
to x-rays and training them on the capsule.
The team’s best shot had a gain of less than
0.01. But there is general optimism following
the past year’s progress. “These are the right
experiments to do,” says Michael Campbell,
a former NIF director now at Sandia National
Laboratories. “Who knows how far they can
take this?” —DANIEL CLERY
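
The energy figures quoted in the NIF story can be checked with a few lines of arithmetic. The sketch below assumes that a 2-ton truck means 2000 kilograms, that the two yields pair with the two fuel energies in the order given, and that the full 1.9 megajoules was delivered on each shot. It uses the standard kinetic energy formula, and all names are illustrative.

```python
LASER_ENERGY_J = 1.9e6  # energy the laser can deliver per pulse (1.9 MJ)


def kinetic_energy_j(mass_kg: float, speed_kmh: float) -> float:
    """Kinetic energy in joules, E = 1/2 m v^2, with speed given in km/h."""
    speed_ms = speed_kmh / 3.6
    return 0.5 * mass_kg * speed_ms ** 2


def gain(yield_j: float, input_j: float) -> float:
    """Ratio of fusion energy released to energy supplied."""
    return yield_j / input_j


# Assumption: a "2-ton truck" is taken as 2000 kg.
truck_j = kinetic_energy_j(2000.0, 160.0)
print(f"truck: {truck_j / 1e6:.2f} MJ vs. laser: {LASER_ENERGY_J / 1e6:.1f} MJ")

# (fusion yield, energy deposited in fuel) in joules; pairing by order is assumed.
shots = {"27 September": (14.4e3, 11e3), "19 November": (17.3e3, 9e3)}
for date, (yield_j, fuel_j) in shots.items():
    print(date,
          f"fuel gain {gain(yield_j, fuel_j):.2f},",
          f"gain vs. full laser pulse {gain(yield_j, LASER_ENERGY_J):.4f}")
```

Under these assumptions both shots released more energy than the fuel absorbed, yet the yield stays below 1% of the laser energy, which is consistent with the stated gain of less than 0.01.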
Production of the vital metal will top out and decline within decades,
according to a new model that may hold lessons for other resources 


IF ELECTRONS ARE THE LIFEBLOOD OF A 
modern economy, copper makes up its blood 
vessels. In cables, wires, and contacts, cop- 
per is at the core of the electrical distribu- 
tion system, from power stations to the deli- 
cate electronics that can display this page of 
Science. A small car has 20 kilograms of 
copper in everything from its starter motor 
to the radiator; hybrid cars have twice that. 
But even in the face of exponentially rising 
consumption—reaching 17 million metric 
tons in 2012—miners have for 10,000 years 
met the world’s demand for copper. 

But perhaps not for much longer. A group 
of resource specialists has taken the first 
shot at projecting how much more copper 
miners will wring from the planet. In their 
model runs, described this month in the jour- 
nal Resources, Conservation and Recycling, 
production peaks by about midcentury 


even if copper is more abundant than most 
geologists believe. That would drive prices 
sky-high, trigger increased recycling, 
and force inferior substitutes for copper 
on the marketplace. 

Predicting when production of any nat- 
ural resource will peak is fraught with un- 
certainty. Witness the running debate over 
when world oil production will peak (Science, 
3 February 2012, p. 522). And the early recep- 
tion of the copper forecast is mixed. The work 
gives “a pretty good idea that likely we’ll get 
a peak somewhere around midcentury,” says 
industrial ecologist Thomas Graedel of Yale 
University. Technological optimists disagree. 
“Not that it couldn’t happen, but I don’t think 
it’s likely to happen,” says resource econo- 
mist John Tilton, research professor emeritus 
at the Colorado School of Mines in Golden. 
New and better technology for extracting copper
from the earth has always come to the
rescue before, he notes, so he expects a much- 
delayed peak that businesses and consumers 
will comfortably accommodate by recycling 
more copper and using copper substitutes. 
The copper debate could foreshadow 
others. The team is applying its depletion 
model to other mineral resources, from 
oil to lithium, that also face exponentially 
escalating demands on a depleting resource. 


So far, so good 

The techno-optimists were right about cop- 
per in the past. From nearly nothing in the 
mid-18th century, copper production soared 
along an exponential curve notched only by 
world wars and economic crises. That’s all 
the more impressive considering the accom- 
panying decline in the richness, or grade, 
of the ore being mined. Anyone extracting a
mineral resource goes for the richest, most
easily mined deposits first, so ore grades ran
from 10% to 20% copper until late in the 19th
century and then plummeted to 2% to 3% in
the early 20th century. Since the mid-1990s,
the world copper grade has been just below
1% and in slow decline, according to data
compiled by resource geologist Gavin Mudd
of Monash University, Clayton, in Australia,
in a 2013 paper in the International Journal of
Sustainable Development.

Headed down. Miners must move megatons of low-grade
copper ore at the Zijinshan mine in South China.

Even though the available ores have 
become poorer, forcing miners to claw ever 
greater volumes of rock from kilometer- 
deep open pit mines, the price of copper has 
trended downward since 1900 (with a nota- 
ble spike since 2005 driven by China’s hun- 
ger for raw materials). Multiple factors have 
driven the price decline. Geologists found a 
new type of copper deposit—the porphyry 
ores of buried magma formations—that is 
now the source of most of the world’s copper. 
And lately they have been finding extractable 
porphyry copper faster than it is produced, 
according to Richard Schodde of MinEx 
Consulting in Melbourne, Australia. Equip- 
ment manufacturers have built humongous 
shovels and dump trucks to move huge vol- 
umes of porphyry ore. And chemical engi- 
neers have developed processes such as 
heap leaching—trickling weak sulfuric acid 
through piled crushed ore—to get copper out 
of low-grade ore. 

Even with the technology that is now 
in hand—not what might be developed 
someday—the copper now within reach of 
miners is considerable. At this past October’s 
Geological Society of America meeting, U.S. 
Geological Survey researchers led by geol- 
ogist Jane Hammarstrom of USGS head- 
quarters in Reston, Virginia, reported their 
new assessment of the porphyry copper yet 
to be discovered that could be economically 
mined with current technology. By inferring 
how much copper might be beneath geologi- 
cally likely terrains around the world, the 
group estimated that 2.2 billion metric tons of 
economically extractable metal remain to be 
found. At current rates of production, that’s a 
125-year supply for the world. 
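
The "years of supply" figure is a static ratio: tonnage divided by a fixed annual production rate. The Python sketch below reproduces it from the numbers quoted in this article. The text does not state which production rate the assessment used, so the sketch also back-calculates the rate implied by 125 years. The function name is illustrative.

```python
def static_supply_years(resource_tonnes: float,
                        annual_production_tonnes: float) -> float:
    """Years a resource lasts if production stays flat at today's rate."""
    return resource_tonnes / annual_production_tonnes


UNDISCOVERED_T = 2.2e9      # tonnes yet to be found, per the USGS estimate
PRODUCTION_2012_T = 17e6    # tonnes per year, the 2012 figure quoted earlier

years = static_supply_years(UNDISCOVERED_T, PRODUCTION_2012_T)
implied_rate = UNDISCOVERED_T / 125  # rate that would give exactly 125 years

print(f"At the 2012 rate: about {years:.0f} years")
print(f"Rate implied by a 125-year supply: {implied_rate / 1e6:.1f} Mt/yr")
```

Using the 2012 rate gives a slightly longer figure than 125 years, so the assessment presumably assumed a somewhat higher current rate. Either way the ratio ignores growth in demand.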


Not so fast 

The world’s copper future is not as rosy as a 
minimum “125-year supply” might suggest, 
however. For one thing, any future world 
will have more people in it, perhaps a third 
more by 2050. And the hope, at least, is that a 
larger proportion of those people will enjoy a 
higher standard of living, which today means
a higher consumption of copper per person. 
Sooner or later, world copper production 
will increase until demand cannot be met 
from much-depleted deposits. At that point, 
production will peak and eventually go into 
decline—a pattern seen in the early 1970s 
with U.S. oil production. 

For any resource, the timing of the peak 
depends on a dynamic interplay of geology, 
economics, and technology. But resource 
modeler Steve Mohr of the University of 
Technology, Sydney (UTS), in Australia, 
waded in anyway. For his 2010 disserta- 
tion, he developed a mathematical model for 
projecting production of mineral resources, 
taking account of expected demand and the 
amount thought to be still in the ground. In 
concept, it is much like the Hubbert curves 
drawn for peak oil production, but Mohr’s 
model is the first to be applied to other min- 
eral resources without the assumption that 
supplies are unlimited. 

Heading up, until ... The world has been producing ever larger
amounts of copper (left, in a purifying electrolytic bath), but
given the planet's finite endowment, production must peak. A model
projection has production peaking by 2040.

Now Mohr and Mudd have teamed up
with resource specialists Stephen Northey of
Australia’s national research agency CSIRO
in Clayton, Zhehan Weng of Monash Clayton,
and Damien Giurco of UTS to apply
Mohr’s model to copper. For their study, the
group drew on a database of the extractable 
copper at all known mine sites that was com- 
piled by Mudd and Weng and published in 
2012. The group assumed that per capita 
demand for mined copper would continue 
to rise at the historical rate of 1.6% per year 
and that the world’s population would grow 
from today’s 7.1 billion people to 10 bil- 
lion in 2100. They taught the model how to 
behave realistically by fitting it to past cop- 
per production for each country and type of 
deposit. In the model, increasing demand 
elicits increased production at existing mines 
and the opening of new mines. 
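
To get a feel for what those two demand assumptions imply together, the Python sketch below compounds them out to 2100. It is a back-of-the-envelope calculation, not the group's model. It assumes "today" means 2014 and that per capita growth compounds annually; the text states neither.

```python
START_YEAR = 2014           # assumption: the year meant by "today"
END_YEAR = 2100
POP_START = 7.1e9           # people, from the text
POP_END = 10e9              # people in 2100, from the text
PER_CAPITA_GROWTH = 0.016   # 1.6% per year, from the text

years = END_YEAR - START_YEAR

# Growth in head count alone.
population_factor = POP_END / POP_START
# Constant annual population growth rate that would produce that factor.
implied_pop_growth = population_factor ** (1 / years) - 1
# Growth in copper use per person, compounded annually.
per_capita_factor = (1 + PER_CAPITA_GROWTH) ** years
# Total demand scales with people times use per person.
total_factor = population_factor * per_capita_factor

print(f"Population grows {population_factor:.2f}x "
      f"({implied_pop_growth:.2%} per year)")
print(f"Per capita demand grows {per_capita_factor:.2f}x")
print(f"Total demand grows {total_factor:.2f}x by {END_YEAR}")
```

Under these assumptions most of the increase comes from rising use per person rather than from population growth.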

The model delivers some good news, 
suggesting that production can rise to meet 
expected demand for the next 2 to 3 decades. 
“It’s not a story of doom and gloom, of running
out tomorrow,” Giurco says, “but rather
of needing to be more mindful of use.” But 
trouble comes in the longer term. With the 
amount of extractable copper in the Mudd and 
Weng compilation, the model shows produc- 
tion peaking just before 2040; after that, cop- 
per can’t be extracted from depleted mines 
any faster, no matter how high the price. 

Increasing the amount of accessible copper
in the model by 50% to account for what
might yet be discovered moves the production 
peak back only a few years, to about 2045. It 
just takes a lot of copper to satisfy exponen- 
tially growing demand, Mohr says. In addi- 
tional model runs performed at the request 
of Science, Mohr found that even doubling 
the available extractable copper pushes peak 
production back only to about 2050. And 
quadrupling it—an optimistic projection 
indeed—would mean the world would run 
short of copper by about 2075. 
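
Why does doubling the resource buy so little time? A simplified calculation shows the effect. If production grows at a constant rate r from a starting rate P0, cumulative output reaches a resource R after T = ln(1 + rR/P0) / r years, so each doubling of R adds less than ln(2)/r years. The Python sketch below is not Mohr's model, and it computes exhaustion time rather than peak time. The baseline resource `R0` and the growth rate `r` are hypothetical values chosen for illustration.

```python
import math


def years_to_exhaust(resource_t: float, initial_rate_t_per_yr: float,
                     growth: float) -> float:
    """Years until cumulative production equals the resource, with
    production growing exponentially at `growth` per year."""
    return math.log(1.0 + growth * resource_t / initial_rate_t_per_yr) / growth


P0 = 17e6    # tonnes per year, the 2012 production quoted in the article
R0 = 2.0e9   # tonnes; hypothetical baseline resource, not from the source
r = 0.02     # hypothetical growth rate of 2% per year

multiples = [1, 2, 4]
times = [years_to_exhaust(m * R0, P0, r) for m in multiples]
doubling_cap = math.log(2) / r  # upper bound on years gained per doubling

for m, t in zip(multiples, times):
    print(f"{m}x resource lasts about {t:.0f} years")
print(f"Each doubling adds less than {doubling_cap:.0f} years")
```

Quadrupling the resource in this toy calculation does not come close to quadrupling its lifetime, the same qualitative behavior as the model runs described above.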


Copper trouble spots 

So far, so bad, but technological optimists
are quick to note that human ingenuity has
confounded the gloom-sayers before. “As
a society, we have tended to underestimate
how much copper is out there, and how creative
society can be about extracting it,” Tilton
says. He points out that in the 1970s, USGS
estimated that about 1.6 billion tons of copper
could be extracted with current technology.
Today, the equivalent USGS figure is
3.1 billion tons. “And it’s very likely to double
again,” Tilton says, even without including
the copper on the ocean floor along midocean
ridges. “We know the copper’s there; it’s a
matter of resolving technical problems allowing
extraction,” he says.

Graedel doesn’t go that far, saying the
world has been so thoroughly explored for
copper that most of the big deposits have
probably already been found. Although
there will be plenty of discoveries, they will
likely be on the small side, he says. As for
technological breakthroughs on a par with
those in the past, he says, “you can’t tell.”

Furthermore, the models don’t take into
account constraints on copper mining that
could make things worse. “The critical issues
already constraining the copper industry are
social, environmental, and economic issues,”
Mudd writes in an e-mail. Any process
intended to extract a kilogram of metal locked
in a ton of rock buried hundreds of meters
down inevitably raises issues of energy and
water consumption, pollution, and local community
concerns. And such “environmental
and societal constraints are getting stronger,”
Mudd says.

Mudd has a long list of copper mining
trouble spots. The Reko Diq deposit in
northwestern Pakistan close to both Iran
and Afghanistan holds $232 billion of copper,
but it is tantalizingly out of reach, with
security problems and conflicts between
local government and mining companies
continuing to prevent development. The big
Panguna mine in Bougainville, Papua New
Guinea, has been closed for 25 years, ever
since its social and environmental effects
sparked a 10-year civil war that left about
20,000 dead.

Postpeak options. The price spike at peak copper will drive even
more recycling of scrap (above). U.S. pennies used to be pure
copper (far left), but now they are copper-plated zinc; substitutions
in major uses of copper will be far less satisfactory.

And, looking ahead, on 15 January the 
U.S. Environmental Protection Agency 
issued a study of the potential effects of the 
yet-to-be-proposed Pebble Mine on Bristol 
Bay in southwestern Alaska. Environmental 
groups had already targeted the project, and 
the study gives them plenty of new ammu- 
nition, finding that it would destroy as much 
as 150 kilometers of salmon-supporting 
streams and wipe out more than 2000 hectares
of wetlands, ponds, and lakes.

As a crude way of taking account of such 
social and environmental constraints on pro- 
duction, Northey and colleagues reduced 
the amount of copper available for extrac- 
tion in their model by 50%. Then the peak 
that came in the late 2030s falls to the early 
2020s, just a decade away. 


After the peak 

Whenever it comes, the copper peak will bring 
change. Alternative materials can replace cop- 
per in many uses, but substitution in some is 
easier than in others. In 1982, the U.S. cop- 
per penny—at least 88% copper since 1793— 
became 97.5% zinc and just 2.5% copper, 
mostly as copper plating, to discourage people 
from melting down the coins for their copper. 
But Graedel and his Yale colleagues reported 
in a paper published on 2 December 2013 in 
the Proceedings of the National Academy of 
Sciences that copper is one of four metals— 
chromium, manganese, and lead being the 
others—for which “no good substitutes are 
presently available for their major uses.” 

Recycling is more promising. Copper is 
already the third most recycled metal after 
iron and aluminum. Roughly 50% of the 
copper that goes out of service is returned 
to use, Graedel says. Governments could 
increase that figure by requiring product 
designs that, say, made recovery of copper 
wiring from cars easier and less expensive. 
Scarcity-driven price hikes will also boost 
recycling, Graedel notes. 

Copper is far from the only mineral 
resource in a race between depletion— 
which pushes up costs—and new technol- 
ogy, which can increase supply and push costs 
down. Gold production has been flat for the 
past decade despite a soaring price (Science, 
2 March 2012, p. 1038). Much crystal-ball
gazing has considered the fate of world oil
production. “Peakists” think the world may be 
at or near the peak now, pointing to the long 
run of $100-a-barrel oil as evidence that the 
squeeze is already on. Mohr’s model is only 
slightly less pessimistic: It forecasts an oil 
peak in 2019, he reported in his dissertation. 

Coal will begin to falter soon after, his 
model suggests, with production most likely 
peaking in 2034. The production of all fossil 
fuels, the bottom line of his dissertation, will 
peak by 2030, according to Mohr’s best esti- 
mate. In the studies Mohr has had a hand in 
publishing, only lithium, the essential element 
of electric and hybrid vehicle batteries, looks 
to offer a sufficient supply through this cen- 
tury. So keep an eye on oil and gold the next 
few years; copper may peak close behind. 

—-RICHARD A. KERR 
A little extra. A human cell with twice
the normal number of chromosomes 
(white) attempts to divide. 


Some mammalian cells are loaded with extra sets of chromosomes, 
a state called polyploidy. What on Earth for? 


A dividing cell generally follows a simple 
rule. After duplicating its DNA, the cell splits, 
yielding two daughter cells. That’s why the 
movies of dividing mouse liver cells shot sev- 
eral years ago by Andrew Duncan, then a post- 
doc in Markus Grompe’s group at the Oregon 
Health & Science University in Portland, flab- 
bergasted his lab mates. “We saw a single cell 
giving rise to three and four daughter cells,” 
says Duncan, who is now a tissue biologist 
at the University of Pittsburgh in Pennsylva- 
nia. And though chromosomes normally line 
up neatly across the middle of a cell before it 
divides, the chromosomes in many of the liver 
cells were arranged in unconventional forma- 
tions, including multiple clusters. 

The parental liver cells were forced to 
go through unusual maneuvers because 
they were polyploid, carrying extra sets of 
chromosomes. Polyploidy is rife among 
plants, insects, fish, and some other groups of 
organisms. But most human cells are diploid, 
outfitted with two sets of chromosomes that 
trace back to the set each provided by an egg 
and a sperm. Indeed, extra chromosomes 
usually spell trouble in mammalian cells. 
A few normal cells in people and other 
mammals, however, brim with extra genome 
copies—sometimes as many as a thousand. 
The contortions of the liver cells were 
surprising, but they had long been known to 
have a surfeit of chromosomes—as do cells in 
the heart and bone marrow. 

For decades, researchers have speculated 
about whether polyploidy offers any 
advantages to mammalian cells, such as 
ramping up protein synthesis, but haven’t
been able to test their ideas. That has changed 
with the identification of several proteins 
that help regulate polyploidy. By cranking 
cells’ allotment of chromosomes up or down, 
scientists recently have begun to explore the 
possible function of the odd cellular state. Do 
the extra chromosomes simply add bulk to 
cells that need it? Do they give cells reserve 
capacity that enables them to respond to stress 
and damage? “The real unanswered question 
is why any cell type is polyploid,” 
says developmental geneticist 
Robert Duronio of the University 
of North Carolina (UNC), 
Chapel Hill. “We are poised to 
begin answering that question.” 

And even though the mystery 
of polyploidy’s benefits remains 
unsolved, some researchers already hope 
to exploit the phenomenon. They are trying 
to turn polyploidy against certain cancers, 
compelling cells to cease their out-of- 
control division. 


Risky excess 
Polyploidy can seem like “a dangerous 
escapade,” as Duronio and his colleagues 
put it in a 2009 paper. For cells that usually 
get along just fine with two sets of chro- 
mosomes, even one additional chromo- 
some can be disastrous. An extra copy of 
chromosome 21 during development pro- 
duces the disabilities of Down syndrome, 
for instance. 

There’s another potential drawback 
to polyploidy. “It can drive cancer,”
says David Pellman, a cell biologist and 
pediatric oncologist at the Dana-Farber 
Cancer Institute in Boston. He points to a 
2013 Nature Genetics paper by Rameen 
Beroukhim, also of Dana-Farber, and 
colleagues that reported duplicated 
genomes in 37% of cancers. Polyploidy 
doesn’t lead to cancer in every case, 
Pellman says, but it’s a big enough risk that 
many cells go to great lengths to thwart it. 
p53, the watchful protein dubbed 
the guardian of the genome, often 
prompts cells with abnormal 
amounts of DNA to commit 
suicide or to curtail division. To 
become polyploid, therefore, 
cells have to disable it and other 
safeguards that protect against 
genome damage, notes biologist Gustavo 
Leone of Ohio State University, Columbus. 

Researchers have gradually acquired a 
good grasp of the molecules and mechanisms
that make cells polyploid, thanks mainly to 
their work on the cell cycle. A cell’s life cycle 
includes milestones such as DNA dupli- 
cation and division. An intricate network of 
proteins controls the cell’s progress through 
the cycle, pushing it forward or holding it 
back. Under the right circumstances, 
researchers have found, some of these 
proteins steer cells toward polyploidy. 

To tweak the chromosome content of 
cells, several research teams have recently 
genetically engineered mice to make more 
or less of these polyploidy promoters. 
For example, biochemist Katya Ravid of
Boston University School of Medicine and 
colleagues enhanced polyploidy to test its 
role in megakaryocytes, hefty immune cells 
that dwell in the bone marrow and generate 
the platelets that help stanch bleeding. 
Megakaryocytes often harbor more than 
100 copies of their genome, and researchers 
conjectured that the extra genes help the cells 
crank out platelets. 

In 2010, Ravid’s team engineered
mice to manufacture excess amounts of a
polyploidy-promoting protein. Although
the alteration boosted the number of
chromosome sets the cells contained, it
didn’t cause a corresponding rise in platelet
numbers, the team revealed in The Journal
of Biological Chemistry. Ravid suggests that
polyploidy instead benefits megakaryocytes
by boosting production of proteins that the
cells need for structural support and sticking
to their neighbors.

Bonus DNA. The polyploid cells in mammalian bodies differ in their location,
function, and number of chromosome sets (table). In a liver cell (right), the
multiple chromosome copies (blue) have sorted into three clusters in preparation
for cell division.

Cell type               Location     Function                            Chromosome sets
Megakaryocyte           Bone marrow  Producing blood clotting platelets  Up [illegible]
Hepatocyte              Liver        Detoxification, [illegible]         Typically 4 to 16
Trophoblast giant cell  Embryo       Promote implantation                Up to 1000
Cardiomyocyte           Heart        Contraction                         Typically 4


Bulking up

Biophysical engineer Dennis Discher of the
University of Pennsylvania School of Medicine
offers another explanation. He suspects
that polyploidy helps a megakaryocyte in the
same way a high-calorie diet helps a sumo
wrestler: by increasing bulk. A membrane
perforated by small pores separates the bone
marrow from the bloodstream, and a megakaryocyte
has to stay on the bone marrow
side. Discher and his colleagues recently
examined what size pores different types of
bone marrow cells could slip through, and
they found that megakaryocytes had trouble
squeezing through even the largest openings,
probably because of their chromosome-packed
nuclei. “If you ask me why this cell is
polyploid, I’d say it helps anchor the body of
the cell in the marrow,” says Discher, whose
team reported its findings in the 19 November
2013 issue of the Proceedings of the
National Academy of Sciences.

Researchers already have evidence from
other species that extra heft is a benefit of
polyploidy. In a 2012 study of fruit flies,
Terry Orr-Weaver of the Massachusetts
Institute of Technology and her colleague
Yingdee Unhavaithaya found that when
they reduced the levels of a polyploidy-stimulating
protein in cells forming the
blood-brain barrier in flies, the cells shrank
and the barrier became leaky. The pair also
showed that enlarging the undersized cells
restored a tight seal. Boosting the size of
existing cells might cause less disruption
than producing more cells through division,
which requires that a cell disengage from its
neighbors, notes cell biologist Brian Calvi
of Indiana University, Bloomington.

Yet for one mammalian cell type that
takes polyploidy to the extreme, work
by Leone’s team downplays the size
connection. Cells in the outer layer of
embryos, known as trophoblast giant cells,
are polyploidy champions: in mice they
pack up to 1000 genome copies. The cells
help the embryo implant in its mother’s
womb, and researchers have suggested
that adding chromosomes allows the cells
to quickly enlarge, enabling the embryo to
infiltrate the uterine lining.

Leone and his colleagues deleted
genes for polyploidy-promoting proteins
from trophoblast giant cells in mice,
anticipating that embryos would die
because implantation would suffer. “We
were expecting that polyploidy is really
significant,” Leone says. Although the giant
trophoblast cells were smaller than normal
and carried fewer chromosomes, the mouse
embryos lived and grew up into seemingly
healthy adults, the researchers reported in
Nature Cell Biology in 2012.


Deep reserves
For the heart and the liver, two hard-working
organs that also teem with polyploid cells,
researchers are exploring a different explanation
for polyploidy: The extra chromosomes
boost performance under trying conditions
and increase overall resilience. Indeed,
“polyploidy may be an important stress response
or adaptation” for many cell types, says cell
biologist Donald Fox of Duke University
Medical Center in Durham, North Carolina.
Support for that notion comes from a study
of the mouse heart, in which almost all the
cells sport four sets of chromosomes. In 2010,
stem cell biologist Thomas Braun of the Max
Planck Institute for Heart and Lung Research
in Bad Nauheim, Germany, and colleagues
examined genetically altered mice whose
muscle cells, including those in the heart,
were missing a gene that spurs polyploidy.
Although the gene’s absence didn’t make
all the animals’ heart cells diploid, it did
reduce the number of chromosome sets they
contained by about one-third.

“At baseline conditions, they are pretty 
normal,” Braun says of the mice. However, 
deficiencies appeared when the rodents had to 
cope with setbacks such as a heart attack. The 
hearts of animals with reduced polyploidy 
pumped less blood after an induced heart 
attack than did the hearts of control animals, 
the group reported in Circulation Research. 
How polyploidy enables the heart to rebound 
remains unclear, Braun says. 

The work by Duncan and his colleagues 
on liver cells also backs the stress-response 
hypothesis. Unlike most mammalian organs, 
the liver has a remarkable ability to regenerate
after injury. The liver is also well stocked 
with polyploid cells: In humans, about 50% 
of the liver cells called hepatocytes carry 
extra sets of chromosomes. 

Duncan’s team had originally explored 
whether the polyploid cells within the liver are
“terminally differentiated,” meaning that they
had become mature cells that don’t divide
and help replenish the organ. So the team
transplanted polyploid hepatocyte cells into
mice whose livers had been partially removed.
“To our great surprise, they regenerated the
liver perfectly,” Duncan says.

That’s when Duncan put the liver cells 
under the microscope, turned on the camera, 
and noticed their unorthodox division style. 
The researchers discovered something else 
unusual about the cells. As the team revealed 
in Nature in 2010, when many of the polyploid 
cells divided, they spawned diploid daughter 
cells. But often these diploid daughters hadn’t 
quite returned to normal—many of them had 
gained or lost an individual chromosome, a 
condition called aneuploidy that is generally 
considered ominous. “Most cancer folks will 
tell you that aneuploidy is synonymous with 
cancer,” Duncan says. 

But some researchers have proposed 
that aneuploidy can create useful genetic 
diversity in a tissue or organ, allowing cells 
to add a copy of a beneficial gene or throw 
out a copy of a detrimental one. And when 
Duncan and colleagues studied an example 
of liver regeneration in mice, they found that 
sites where regrowth occurred were rich in 
aneuploid cells. They have discovered that 
aneuploid cells are abundant in human 
livers, too. 

Duncan now hypothesizes 
that polyploidy in the liver is 
a roundabout way to produce 
aneuploid cells that have 
regenerative properties. His team 
is now working to confirm that 
these cells spur regeneration in 
people suffering from hepatitis 
B, in which a virus devastates the 
liver. Some patients die unless they get a liver 
transplant, but others survive as sections of 
the organ regenerate. The researchers are 
collecting tissue samples to determine if 
areas of the liver that regrow are high in 
aneuploid cells. 

But the idea that polyploidy helps tissues 
regenerate remains a hypothesis, as findings 
from Leone’s group and that of Alain de 
Bruin, a pathologist and veterinarian at 
Utrecht University in the Netherlands, 
emphasize. They genetically engineered mice 
so the animals’ livers lack two polyploidy- 
promoting proteins. “We can generate a 
mouse whose liver is almost entirely [diploid] 
cells,” De Bruin says. Both teams expected
that the animals would suffer ill effects.
Instead, the mice were vigorous, each group
reported in Nature Cell Biology in 2012, and
their livers were no less able to regrow after
injury. The mice De Bruin and colleagues
studied, for example, could restore their livers
after surgical removal of two-thirds of the
organ. “This polyploidization does not have
an effect on regeneration or on proliferation
rate,” De Bruin says.

Double down. The chromosome copies from a polyploid liver cell arranged by size,
showing that the cell carries four copies of almost every one.

The work from both teams also under- 
mines another older polyploidy hypothesis. 
The liver, De Bruin notes, “is all the time
exposed to toxins.” Hepatocytes work hard
to detoxify all those noxious substances, 
and some researchers had speculated that 
their extra genetic material could boost the 
output of proteins crucial to this. Yet the mice 
whose livers had reduced polyploidy had no 
problems breaking down toxins, De Bruin’s 
group found. 


Exploiting polyploidy 

Even as they wrestle with the mystery of polyploidy,
researchers wonder whether they
can put what they’ve learned to use. Leu- 
kemia biologist John Crispino of North- 
western University’s Feinberg School of 
Medicine in Chicago, Illinois, and his col- 
leagues have trained their sights on a type 
of acute myeloid leukemia, triggered by 
megakaryocytes, that kills most adults 
who develop it. Mature megakaryocytes
don’t divide, but in this form of cancer, the 
cells remain immature, don’t become poly- 
ploid, and replicate prodigiously, causing 
the leukemia. Crispino and colleagues pro- 
pose that forcing the cells to become poly- 
ploid and mature might treat the cancer. 

The team revealed in Cell in 2012 that 
it had identified more than 200 compounds 
that, in lab dishes, spur polyploidy in 
human megakaryocytes. One 
of these molecules, alisertib, 
is already undergoing clinical
trials for several other types of 
cancer—though not because 
of its ability to stimulate 
polyploidy. Crispino’s group is 
now trying to organize an initial 
safety trial of the drug in people 
with acute myeloid leukemia. 

Although polyploidy research has recorded 
some progress in recent years, the field still 
hasn’t nailed down the benefits polyploidy 
provides to different mammal cell types. 
To move forward, Leone says, researchers 
should take a cue from plant biologists, 
who have tested polyploidy’s advantages in 
specific environmental conditions, showing 
that it boosts tolerance for salinity (Science, 
9 August 2013, p. 658). Scientists could 
perform similar studies on liver cells, for 
example, by gauging whether polyploidy 
helps them deal with different diets. Delving 
further into polyploidy’s cellular roles will 
probably produce some surprises, UNC 
Chapel Hill’s Duronio predicts. “There are 
going to be many uses for polyploidy, and 
we are just scratching the surface.” 

—-MITCH LESLIE 
Global Warming and Winter Weather


IN MID-JANUARY, A LOBE OF THE POLAR VORTEX SAGGED SOUTHWARD OVER THE CENTRAL 
and eastern United States. All-time low temperature records for the calendar date were set 
at O’Hare Airport in Chicago [−16°F (−26.7°C), 6 January], at Central Park in New York [4°F
(−15.6°C), 7 January], and at many other stations (1). Since that event, several substantial snow
storms have blanketed the East Coast. Some have been touting such stretches of extreme cold 
as evidence that global warming is a hoax, while others have been citing them as evidence that 
global warming is causing a “global weirding” of the weather. In our view, it is neither. 

As climate scientists, we share the prevailing view in our community that human-induced 
global warming is happening and that, without mitigating measures, the Earth will continue to 
warm over the next century with serious consequences. But we consider it unlikely that those 
consequences will include more frigid winters. 

Distinguishing between different kinds of extreme weather events is important because 
the risks of different kinds of events are affected by climate change in different ways. For 
example, a rise in global mean temperature will almost certainly lead to an increase in the 
incidence of record high temperatures. Global warming also leads to increases in atmo- 
spheric water vapor, which increases the likelihood of heavier rainfall events that may cause 
flooding. Rising temperatures over land lead to increased evaporation, which renders crops 
more susceptible to drought. As the atmosphere and oceans warm, sea water expands and 
glaciers and ice sheets melt. In response, global sea level rises, increasing the threat of
coastal inundation during storms. 

In contrast to the above examples, the notion that the demise of Arctic sea ice during 
summer should lead to colder winter weather over the United States seems counterintuitive. 
But that is exactly what an influential study has suggested (2). The authors hypothesize that 
global warming could perturb the polar vortex in a manner that renders the flow around it 
more wavy, leading to an increased incidence of both extreme warmth and extreme cold in 
temperate latitudes. It’s an interesting idea, 
but alternative observational analyses and 
simulations with climate models have not 
confirmed the hypothesis, and we do not 
view the theoretical arguments underlying it 
as compelling [see (3–6)]. 

Icy blast. Arctic winds flowed down to North America in January, causing record-breaking cold temperatures. Image 
shows streamlines of wind at the 500 mbar level at 1:00 a.m. Eastern Standard Time on 7 January 2014. Red indi- 
cates faster speeds. 

CREDIT: FIGURE GENERATED BY CAMERON BECCARIO (EARTH.NULLSCHOOL.NET); RESULTS SOURCED FROM THE NCEP/NOAA GLOBAL FORECAST SYSTEM 

Other studies have suggested that the loss 
of Arctic sea ice may influence the atmo- 
spheric circulation in mid-latitudes dur- 
ing summer [e.g., (7)]. Sea-ice losses dur- 
ing late summer may indeed lead to regional 
changes in Arctic climate [e.g., (5, 8)]. But 
tremendous natural variability occurs in the 
large-scale atmospheric circulation during 
all seasons, and even in summer, the links 
between Arctic warming and mid-latitude 
weather are not supported by other observa- 
tional studies (6). The lag between decreases 
in sea-ice extent during late summer and 
changes in the mid-latitude atmospheric 
circulation during other seasons (when the 
recent loss of sea ice is much smaller) needs 
to be reconciled with theory. 

Summertime sea-ice extent in the Arctic 
has been remarkably low since 2007, and the 
ensuing years have been marked by some 
notable cold air outbreaks. It was this coin- 
cidence that prompted Francis and Vavrus (2) 
to link the cold air outbreaks to global warm- 
ing. But coincidence does not in itself consti- 
tute a strong case for causality. Cold air out- 
breaks even more severe than occurred this 
winter affected the United States in the early 
1960s, the late 1970s (most notably 1977), 
and in 1983, back when the Arctic sea ice was 
thicker and more extensive than it is today 
[e.g., (9)]. Over the longer time span of 50 
to 100 years, it is well established that there 
has been a decrease in the rate at which low 
temperature records are being set relative to 
all-time high temperature records at stations 
across the United States (10). For the present 
at least, we believe that statistics based on the 
longer record are more indicative of what the 
future is likely to bring. 
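
The comparison of record lows with record highs can be illustrated with a minimal sketch. The function name `count_records` and the temperature values are hypothetical and are not taken from reference (10); the sketch only shows how new running highs and lows might be tallied in a single station series.

```python
def count_records(temps):
    """Count how often a series sets a new running high or a new running low.

    The first value initializes both records and is not itself counted.
    """
    highs = lows = 0
    hi = lo = temps[0]
    for t in temps[1:]:
        if t > hi:
            hi = t
            highs += 1
        elif t < lo:
            lo = t
            lows += 1
    return highs, lows


# Hypothetical annual extremes (arbitrary units), for illustration only.
series = [10, 12, 9, 13, 8, 14, 15]
record_highs, record_lows = count_records(series)
print(record_highs, record_lows)
```

Under this reading, a falling ratio of low records to high records over many decades is the kind of long-record statistic the paragraph above refers to.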

The research linking summertime Arctic 
sea ice with wintertime climate over temper- 
ate latitudes deserves a fair hearing. But to 
make it the centerpiece of the public discourse 
on global warming is inappropriate and a 
distraction. Even in a warming climate, we 
could experience an extraordinary run of cold 
winters, but harsher winters in future decades 
are not among the most likely nor the most 
serious consequences of global warming. 
JOHN M. WALLACE, ISAAC M. HELD, 
The Big Picture for Big 
Data: Visualization 


TO CONVERT INFORMATION FROM MASSIVE 
data sets into insights, data centers will need 
to support the humans who are trying to make 
sense of it all. Fortunately, innovations in 
information visualization are demonstrating 
that a good user interface is worth a thousand 
petabytes (2013 Visualization Challenge, 
News, 7 February, p. 600). 

When GE Healthcare researcher Nick 
Thomas studied a visualization of the critical 
RBP1 protein—a genomic carrier of vitamin 
A necessary for reproduction and vision— 
he was surprised by what he saw. Thomas 
scanned the mosaic grid of thousands of red 
and green dots, as well as the linked scatter- 
gram and color-coded plate view. He con- 
firmed expected patterns, but one unexpected 
bright red dot revealed RBP1’s marked influ- 
ence in cellular development. This clue 
gave Thomas an insight that, with statistical 
confirmation, led to an important scientific 
contribution (1). 

Like a growing number of research- 
ers, policy-makers, and interested citizens, 
Thomas was exploring increasingly complex 
data sets by adjusting filters, changing color 
palettes, and choosing novel visualizations 
to search for relationships, clusters, gaps, 
and outliers. Powerful information visual- 
ization tools are realizing famed statistician 
John Tukey’s 50-year-old prediction: “The 
graphical potentialities of the computer...are 
going to be the data analyst’s greatest single 
resource” (2). 

Some Big Data advocates seem to prom- 
ise automatic results with little human par- 
ticipation [e.g., (3, 4)]. A more effective 
approach will be to put human users in con- 
trol, since they can often identify patterns that 
machines cannot. Statistically and algorith- 
mically oriented researchers are increasingly 
recognizing that visual strategies for explor- 
ing complex data lead to more potent and 
meaningful insights. Automated analyses can 
work for well-understood data, but visualiza- 
tions increase the efficacy of experts in fron- 
tier topics, where big breakthroughs happen. 
Innovation Goes Global 


BY TAKING A U.S.-NATIONAL APPROACH TO 
innovative capabilities and comparing 
the present with the postwar period, W. B. 
Bonvillian (“Advanced manufacturing poli- 
cies and paradigms for innovation,” Policy 
Forum, 6 December 2013, p. 1173) ignores 
the truly transformational change that has 
occurred over the past several decades: the 
growth of the global science system. The 
critical knowledge needed to innovate into 
the next generation of production is increas- 
ingly distributed across the globe, and it is 
just as likely to be located in India or China 
as in Ohio. The Organisation for Economic 
Co-operation and Development reports that 
the growth in the number of triadic patents 
demonstrates the worldwide spread of inno- 
vative activities (1). 

U.S. researchers are actively tapping 
this global resource by collaborating with 
researchers from many other countries. The 
global network of international links (drawn 
from coauthorships on publications) has 
tripled in density over the past 20 years (2), 
with many new members joining the global 
network from developing countries, particu- 
larly China. Chinese addresses now appear 
more frequently than those of any other country in pub- 
lications coauthored with U.S. researchers. 

Scientific globalization does not threaten 
an end to U.S. excellence in innovation; quite 
the opposite. The diffusion and rooting of sci- 
entific capacity to new places provides oppor- 
tunity for greater efficiency in research activ- 
ities, particularly by removing redundancy. 
Creative problem-solving can be enhanced 
by having new entrants grapple with techno- 
logical challenges, as many U.S. companies 
are finding as they invest in foreign research. 
Culturally tied knowledge is often impor- 
tant to market access in foreign countries. 
These goods require a deliberate policy shift 
on the part of U.S. agencies from pushing 
knowledge creation to fomenting knowl- 
edge scanning and integration. Scanning 
the globe for the best new knowledge and 
ensuring local uptake is the more promising 
approach to closing the gaps in U.S. know- 
how than building a U.S.-only R&D effort, as 
Bonvillian suggests. 
CORRECTIONS AND CLARIFICATIONS 


Perspectives: “Hiding in plain view—An ancient dog in 
the modern world” by H. G. Parker and E. A. Ostrander (24 
January, p. 376). In the figure, the red branch should have 
been labeled “CTVT.” The HTML and PDF versions online 
have been corrected. 


Reports: “Transmissible dog cancer genome reveals 
the origin and history of an ancient cell lineage” by E. 
P. Murchison et al. (24 January, p. 437). In the title, 
“Transmissable” should have been “Transmissible.” The 
HTML and PDF versions online have been corrected. 


Reports: “Identification of a plant receptor for extracellu- 
lar ATP” by J. Choi et al. (17 January, p. 290). The doi should 
be 10.1126/science.343.6168.290. It is correct in the HTML 
and PDF versions online. 


Research Article: “The hidden geometry of complex, 
network-driven contagion phenomena” by D. Brockmann 
and D. Helbing (13 December 2013, p. 1337). In Fig. 2D, 
the label “Zamonia” should have read “Latvia.” The HTML 
and PDF versions online have been corrected. 


Letters to the Editor 


Letters (~300 words) discuss material published in 
Science in the past 3 months or matters of gen- 
eral interest. Letters are not acknowledged upon 
receipt. Whether published in full or in part, Let- 
ters are subject to editing for clarity and space. 
Letters submitted, published, or posted elsewhere, 
in print or online, will be disqualified. To submit a 
Letter, go to www.submit2science.org. 
DEVELOPMENT 


YseX Is a Matter of Concern Rather Than a Matter of Fact 


Amade M'charek 


The self-evidence and power of the X 
and Y chromosomes in science and 
society cannot be overestimated. As a 
binary couple they line up with other familiar 
biological categories such as eggs and sperm 
or estrogen and androgen, and increasingly 
they’ve come to stand for females and males. 
However, what if one did not take them as a 
matter of fact and instead asked how X and 
Y came to stand for female and male? What 
does it take to sex these chromosomes? Such 
questions refer not so much to the bodies in 
which these chromosomes are found but to 
the scientific practices that study them. Mak- 
ing the work of science visible, demonstrat- 
ing how morals and values are part and parcel 
of the epistemology of science, means under- 
standing the objects of science as “matters of 
concern” (1)—objects that require care and 
deserve density. 

Historian of science Sarah S. Richardson 
(Harvard University) has taken this demon- 
stration as her very task. In her erudite and 
well-balanced Sex Itself, she “examines the 
interaction between cultural gender norms 
and genetic theories of sex from the begin- 
ning of the twentieth century to the pres- 
ent postgenomic age.” Richardson takes 
issue with the perpetual reductionist view 
on sex differences. Perplexed by the sugges- 
tion made in 2005 that genetic differences 
between men and women are larger than 
those between humans and chimpanzees, she 
meticulously demonstrates how the genetics 
of sex has been modeled on alleged and often 
rehearsed gender distinctions between men 
and women. But she does more. Richardson 
skillfully demonstrates how instrumental sex 
differences have been in the development of 
genetics. For example, she shows how the sex 
chromosomes were a key aspect in develop- 
ing the chromosomal theory of inheritance 
and paving the way for experimental studies 
of gene mutations and genomic organization. 
The book includes several case studies that 
help us to understand the history of genetics 
in general. 

Sex Itself consists of four parts. In three 
early chapters, Richardson examines the rec- 
ognition of sex chromosomes in the early 
20th century and their emergence in the con- 
text of chromosome mapping and the Men- 
delian theory of inheritance. She surveys the 
terminology (such as heterochromosomes, 
accessory chromosome, and sex chromo- 
some) and makes visible the guardedness 
of geneticists to reducing sex 
determination to these chro- 
mosomes. Thomas Hunt Mor- 
gan, from whom she derives 
her title, preferred to say “sex 
factors on the chromosomes.” 
She then discusses how the 
arrival of sex hormones pro- 
vided sex chromosomes with 
a strong ally. They allowed for 
“a simple two-tiered model of 
sexual development, with genes as the initia- 
tors and sex hormones as the dominant agent 
of sexual differentiation.” Richardson argues 
that hormone science thus “helped to consti- 
tute and solidify the ‘sex chromosome.’” 
Richardson opens her second line of argu- 
ment by analyzing the gender stereotypes that 
contributed to the sexing of the chromosomes 
and the identification of the Y with males and 
the X with females. While intelligent men 
such as Brian Sykes and Craig Venter ascribe 
much power to their Y chromosomes, “the 
vessel of manhood” (2), by contrast Richard- 
son locates that power in a deeply gendered 
history of genetics. To that end, she examines 
a classic case of scientific error, the so-called 
“XYY supermale syndrome” of the 1960s 
and 1970s. Based on the speculation that the Y 
chromosome might contain some male traits, 
XYY males were viewed as having a double 
dose of maleness. Flawed research character- 
ized these men as more sexual and aggressive 
than men who carried a single Y. Thus while 
XX and XY stood for the phenotypic differ- 
ence between females and males, XYY hinted 
at the behavior of males. This case mirrors 
X-mosaicism theories of female biology and 
behavior, which Richardson takes up next. 
X-mosaicism has been cast as revealing 
huge differences between males and females 
and as explaining ascribed behavioral traits of 
women (e.g., complicated, inconsistent, and 
unpredictable). Whereas mosaic X inactiva- 
tion is constantly presented as a necessary 
essential female trait (for example, in stud- 
ies of the incidence of autoimmunity), Rich- 
ardson argues that these studies fail to take 
the biological context into account. Doing 
so reveals that X inactivation does not make 
females more female, but more like males. It 
“serves to equalize X dosage between males 
and females, so that the cells of both sexes are 
functionally monosomic for X-linked genes.” 
In her third argumentative strand, Rich- 
ardson raises important questions of whether 
and how feminist scholarship has contributed 
to the science of sex. The case of sex deter- 
mination proves to be an excellent example. 
Richardson reports on the race to locate the 
male sex-determining gene on the Y chro- 
mosome and the subsequent 
growing dismay that the SRY 
gene may not be in control. 
Rejecting the assumption that 
females were the result of a 
passive sex-determining path- 
way, feminist scientists began 
in the 1990s to critique the 
notion of the “master gene” 
as well as Y chromosome–centered 
research in sex determination. 
Jennifer Graves and other leading 
researchers argued that a gendered view such 
as the “dominant Y” with masculine quali- 
ties had geneticists believing that SRY is an 
activator and ignoring the fact that it could 
be an inhibitor or “a spoiler that turns off 
genes” (3). By the early 2000s, the notion of 
the master gene faded away to make room 
for the complexity of sex determination and 
the view that both male and female pathways 
played active roles therein. As Richardson 
argues, these shifts cannot be fully explained 
by feminist interventions, but gender criti- 
cism did actively interact with the research on 
sex determination. 

In a contrasting case of the “vanishing Y” 
and the debate over whether the Y chromo- 
some is gene-rich or almost dying out, Rich- 
ardson zooms in on the ways gendered notions 
do not merely play a role in media debate. 
They are part and parcel of the research ques- 
tions, design, and knowledge that comes out 
of laboratories. As there is no way of being 
gender neutral, we had better acknowledge 
and critically reflect on gender in genetics. 


In the concluding chapters, Richardson 
attends to the genomic era and its effect on sex 
science. She offers reflections on the previous 
cases and prognoses focused on potential risks 
involved in the genomization of sex research. 
Like many scholars studying the social aspects 
of genomics, she voices concerns about the 
ways genomic science is reifying differences 
among people. She quite correctly notes that 
these concerns have been especially attended 
to in studies of the implications of genomics 
on race and racism, whereas little ink has been 
spilled on the reification of sex differences. 
Thus the urgent need for Sex Itself. Not simply 
an account of the effect of gender on genetics, 
it provides us with tools to think of the possi- 
bility of a gender-critical genetics. 
ENVIRONMENTAL LAW 


The Case for a Public “Trust” 


Rena Steinzor 


Heaven only knows it is way past time 
for us to stop talking about whether 
climate change is the gravest threat 
that has ever confronted humankind while 
the shrill voices of a tiny minority insist noth- 
ing is wrong. Instead, the governments of all 
countries should be engaged in the most seri- 
ous possible debate over how to mitigate and 
adapt to the effects humans have had on cli- 
mate. The reasons that we have not reached 
this crucial phase are complicated. At the 
very least, they involve the myopic entitle- 
ment of developed nations and developing 
nations’ desperate efforts to catch up. Our 
paralysis arises from the limits of our psy- 
chology and the constraints of our political 
systems; both seem entrenched in their com- 
mitment to preserving stability by maintain- 
ing the status quo. To the extent, though, that 
the United States legal system in and of itself 
frustrates an effective response, Mary Chris- 
tina Wood’s Nature’s Trust makes a discrete 
contribution to the search for climate change 
solutions by rejecting dominant paradigms 
out of hand. Instead Wood (University of Ore- 
gon School of Law) urges the courts to pick 
up the isolated, tenuous threads of the “pub- 
lic trust” doctrine and use them to compel the 
executive and legislative branches to embrace 
the idea that all natural resources (including 
Earth’s atmosphere) cannot be used in any 
way that exacerbates climate change. 

The public trust, envisioned as the pooled 
ownership over natural resources possessed 
by all the country’s citizens regardless of pre- 
vious concepts of private property, would 
reject any land use that does not preserve the 
ability of nature to replenish itself. The doc- 
trine would drive private rights into the back- 
ground, supplanting them on the rationale 
that cutting back human production of green- 
house gases is of overriding importance and 
that nature does not have the capacity to safely 
absorb the quantities we have already emitted. 

The nature’s trust is the antithesis of what 
the author dubs “predatory capitalism.” If, as 
Wood argues, Congress, the 
Executive Branch, polluting 
industries, and national envi- 
ronmental groups are hope- 
lessly corrupt, the advance- 
ment of the doctrine would 
depend on a widespread grass- 
roots rebellion and the indepen- 
dent thinking of the courts. The 
author is optimistic about the 
development of a mass move- 
ment because “assertions of 
commonwealth thinking now appear across 
the United States” in the form of “commu- 
nity gardens, inner-city farms, and urban 
homesteads.” She acknowledges that it will 
take “enormous numbers of citizens to grow 
these seeds of change into a land revolution 
so strong that it displaces the market-driven 
system of land exploitation.” But, she assures 
us, “those who truly cherish private property 
rights will find their calling in this land-as- 
commonwealth frame, as they will come to 
learn that their liberty and quiet enjoyment of 
land depends, first and foremost, on Earth’s 
life-sustaining ecological endowment.” 

I must confess to wondering about the peo- 
ple who are unlikely to renounce private prop- 
erty: those grown wealthy and comfortable 
on the basis of land exploitation. Presumably, 
they will be overwhelmed and then reformed 
by everyone else, including those who have 
never felt secure enough to own anything. But 
because Wood is a zealot in the best sense of 
the word—possessing enormous energy, pas- 
sion, and conviction that she has discovered 
the one true path forward—she does not dwell 
for long on the efforts the former will make to 
resist her. Nor does she acknowledge in any 
realistic way the privations we would endure 
to make the radical transition she envisions. Of 
course, she might respond to these critiques 
by asking whether the specter of unmitigated 
climate change is acceptable—and, of course, 
it is not. Yet the polarization of these alterna- 
tives undermines the book’s credibility. How 
a mass movement would be sustainable if 
fed only by Wood’s ideal- 
ism, without preparing for the 
sacrifices that are inevitable, is 
far from clear. 

Nature’s Trust is hefty, run- 
ning over 450 pages, and its 
primary value will be to law- 
yers who need a compendium 
of legal precedents to help 
them formulate test cases. 
A second audience might be 
composed of people who are 
not quite convinced about the vagaries of 
government regarding this issue. Wood effec- 
tively stokes rage against those in power—a 
white-hot phenomenon akin to road rage, at 
least with respect to Congress and the expan- 
sive Executive Branch. Oddly, she seems to 
have more confidence in the courts to adopt 
the nature’s trust position, even though fed- 
eral judges are not elected by anyone, includ- 
ing the enormous numbers of citizens she 
hopes will see the light. 

Nonetheless, as jacket blurbs by Bill 
McKibben, James Hansen, and Ross Gelb- 
span express quite well, Nature’s Trust is both 
ambitious and original. For anyone interested 
in using the legal system to prod action, Wood 
has made a major contribution. 
Methane Leaks from North 
American Natural Gas Systems 
Natural gas (NG) is a potential “bridge 
fuel” during transition to a decarbon- 
ized energy system: It emits less car- 
bon dioxide during combustion than other fos- 
sil fuels and can be used in many industries. 
However, because of the high global warming 
potential of methane (CH₄, the major compo- 
nent of NG), climate benefits from NG use 
depend on system leakage rates. Some recent 
estimates of leakage have challenged the ben- 
efits of switching from coal to NG, a large 
near-term greenhouse gas (GHG) reduction 
opportunity (1–3). Also, global atmospheric 
CH₄ concentrations are on the rise, with the 
causes still poorly understood (4). 

To improve understanding of leakage 
rates for policy-makers, investors, and other 
decision-makers, we review 20 years of tech- 
nical literature on NG emissions in the United 
States and Canada [see supplementary mate- 
rials (SM) for details]. We find (i) measure- 
ments at all scales show that official inven- 
tories consistently underestimate actual CH₄ 
emissions, with the NG and oil sectors as 
important contributors; (ii) many indepen- 
dent experiments suggest that a small number 
of “superemitters” could be responsible for a 
large fraction of leakage; (iii) recent regional 
atmospheric studies with very high emissions 
rates are unlikely to be representative of typi- 
cal NG system leakage rates; and (iv) assess- 
ments using 100-year impact indicators show 
system-wide leakage is unlikely to be large 
enough to negate climate benefits of coal-to- 
NG substitution. 
Underestimation—Device to Continent 

This study presents a first effort to system- 
atically compare published CH₄ emissions 
estimates at scales ranging from device- 
level (>10³ g/year) to continental-scale 
atmospheric studies (>10¹³ g/year). Studies 
known to us that (i) report measurement- 
based emissions estimates and (ii) compare 
those estimates with inventories or estab- 
lished emission factors (EFs) are shown in 
the first chart. 
Methane emissions from U.S. and Canadian 
natural gas systems appear larger than official 
estimates. 


Studies that measure emissions directly 
from devices or facilities (“bottom-up” stud- 
ies) typically compare results to emissions 
factors (EFs; e.g., emissions per device). 
Large-scale inventories are created by multi- 
plying EFs by activity factors (e.g., number 
of devices). 
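
The inventory arithmetic described above can be sketched in a few lines. This is a minimal illustration, not the EPA's actual method: the function names, device classes, emissions factors, and counts are all hypothetical, chosen only to show how an inventory is built as EF times activity and how a measured total is compared against it.

```python
def inventory_emissions(emission_factors, activity_counts):
    """Sum EF x activity over device classes (units follow the EFs)."""
    return sum(emission_factors[d] * activity_counts[d] for d in emission_factors)


def measured_to_inventory_ratio(measured, inventory):
    """Ratio > 1 means measurements exceed what the inventory predicts."""
    return measured / inventory


# Hypothetical emissions factors (kg CH4 per device per year) and device counts.
efs = {"valve": 2.0, "compressor": 50.0}
counts = {"valve": 1000, "compressor": 10}

inventory = inventory_emissions(efs, counts)           # 2000 + 500 kg/year
ratio = measured_to_inventory_ratio(3750.0, inventory)  # hypothetical measurement
print(inventory, ratio)
```

Because the inventory is a product of two uncertain quantities, an error in either the emissions factors or the activity counts propagates directly into the total.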

Studies that estimate emissions after 
atmospheric mixing occurs (“atmospheric” 
studies) typically compare measurements to 
emissions inventories, such as the U.S. Envi- 
ronmental Protection Agency (EPA) national 
GHG inventory (GHGI). Atmospheric stud- 
ies use aircraft (1, 5–8), tower (3, 6), and 
ground (3, 7–10) sampling, as well as remote 
sensing (7, 11, 12). All such studies observe 
atmospheric concentrations and must infer 
fluxes by accounting for atmospheric trans- 
port. The various inference methods have 
strengths and weaknesses (see SM). The 
greatest challenge for atmospheric studies 
is attributing observed CH₄ concentrations 
to multiple potential sources (both anthropo- 
genic and natural). 

Inventories and emissions factors consistently underestimate actual measured CH₄ emissions across 
scales. Ratios >1 indicate measured emissions are larger than expected from EFs or inventory. Main graph 
compares results to the EF or inventory estimate chosen by each study author. Inset compares results to 
regionally scaled common denominator (17), scaled to region of study and (in some cases) the sector under 
examination. Multiple points for each study correspond to different device classes or different cases mea- 
sured in a single study. Definitions of error bar bounds vary between studies. (US, United States; Can, Canada; 
SC, South Central; Petrol. and Pet., petroleum; SoCAB, South Coast Air Basin; LA, Los Angeles; DJ, Denver- 
Julesburg; UT, Utah; HF, hydraulic fracturing). See SM for figure construction details. 

Results from bottom-up studies (gener- 
ally <10⁹ g CH₄/year) and atmospheric CH₄ 
studies at regional scale and larger (above 
10¹⁰ g CH₄/year) are shown in the first chart. 
We also include studies that do not focus on 
NG systems, in order to place NG emissions 
in context with other CH₄ sources. Across 
years, scales, and methods, atmospheric 
studies systematically find larger CH₄ emis- 
sions than predicted by inventories. EFs were 
also found to underestimate bottom-up mea- 
sured emissions, yet emissions ratios for bot- 
tom-up studies are more scattered than those 
observed in atmospheric studies (13–16). 

Regional and multistate studies focusing 
on NG-producing (1–3, 9) and NG-consum- 
ing regions (2, 7, 10–12) find larger excess 
CH₄ emissions than national-scale stud- 
ies. This may be due to averaging effects of 
continental-scale atmospheric processes, 
to regional atmospheric studies focusing 
on areas with other air quality problems (1, 
3), or simply to methodological variation. 
Atmospheric measurements are constrained 
in spatial and temporal density: Regional 
studies cover 0.5 to 5% of NG production 
or consumption with dense measurements, 
although often limited to short-duration sam- 
pling “campaigns” (3, 7); national studies 
cover wide areas with limited sample density 
(6) (table S5). 

To facilitate comparison, the inset in the 
first chart normalizes atmospheric studies 
(>10¹⁰ g CH₄/year) to baselines computed 
from the most recent (2011) EPA GHGI esti- 
mates for the year and region in which study 
measurements were made (17). After nor- 
malization, the largest (e.g., national-scale) 
atmospheric studies (>10¹² g CH₄/year) sug- 
gest typical measured emissions ~1.5 times 
those in the GHGI (5, 6, 8, 9). 

Why might emissions inventories be 
underpredicting what is observed in the 
atmosphere? Current inventory methods rely 
on key assumptions that are not generally sat- 
isfied. First, devices sampled are not likely 
to be representative of current technologies 
and practices (18). Production techniques 
are being applied at scale (e.g., hydraulic 
fracturing and horizontal drilling) that were 
not widely used during sampling in the early 
1990s, which underlies EPA EFs (18). 

Potential contributions to total U.S. CH₄ emissions above EPA estimates. EPA estimate in blue, based 
on central estimate and uncertainty range from large-scale studies from the inset in the first chart. Both NG 
sources and possible confounding sectors are included. NG production, petroleum production, and NG dis- 
tribution emissions are based on regional empirical studies (1, 2, 6), which estimate emissions rates from 
high-emitting sources but do not estimate prevalence. Scenarios (a) to (c) correspond to 1, 10, and 25% of 
gas production or consumption from such high-emitting sources. Ranges (d) to (g) correspond to estimates 
for flowback emissions rates during hydraulic fracturing (HF) of all gas wells and shale gas wells, relative to 
EPA estimates. Ranges (h) to (m) reflect sources not included in EPA CH₄ inventories but which could be mis- 
taken for NG emissions by chemical or isotopic composition. See SM for details. 

Second, measurements for generating EFs 
are expensive, which limits sample sizes and 
representativeness. Many EPA EFs have wide 
confidence intervals (19, 20). And there are 
reasons to suspect sampling bias in EFs, as 
sampling has occurred at self-selected coop- 
erating facilities. 

Third, if emissions distributions have 
“heavy tails” (e.g., more high-emissions 
sources than would be expected in a normal 
distribution), small sample sizes are likely to 
underrepresent high-consequence emissions 
sources. Studies suggest that emissions are 
dominated by a small fraction of “superemit- 
ter” sources at well sites, gas-processing 
plants, coproduced liquids storage tanks, 
transmission compressor stations, and dis- 
tribution systems (see table S6 and fig. S2). 
For example, one study measured ~75,000 
components and found that 58% of emissions 
came from 0.06% of possible sources (21). 
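
A minimal sketch of why heavy tails matter for sampling follows. The population is hypothetical (99 ordinary sources and a single "superemitter") and is not drawn from reference (21); `top_share` is an illustrative helper name. It shows how a tiny fraction of sources can dominate the total, and how a small sample that happens to miss the tail underestimates the mean.

```python
def top_share(emissions, fraction):
    """Share of total emissions contributed by the largest `fraction` of sources."""
    ranked = sorted(emissions, reverse=True)
    k = max(1, round(fraction * len(ranked)))  # always count at least one source
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical population in arbitrary units: 99 ordinary sources, one superemitter.
population = [1.0] * 99 + [100.0]

share = top_share(population, 0.01)           # share from the top 1% of sources
true_mean = sum(population) / len(population)

# A small sample that misses the superemitter (the first 10 sources here).
sample = population[:10]
sample_mean = sum(sample) / len(sample)

print(f"top 1% share: {share:.2f}")
print(f"true mean: {true_mean:.2f}, sample mean: {sample_mean:.2f}")
```

In this toy case the sampled mean is roughly half the true mean, which is the direction of bias the paragraph above attributes to emissions factors built from small samples.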

Last, activity and device counts used in 
inventories are contradictory, incomplete, 
and of unknown representativeness (17, 22). 
Data should improve with increased report- 
ing requirements enacted by EPA (23, 24). 


Source Attribution in Atmospheric Studies 

Does evidence suggest possible sources of 
excess CH₄ emissions relative to official 
estimates within the NG sector? A key chal- 
lenge is attribution of atmospheric observa- 
tions to sources. Isotopic ratios (7, 11) and 
prevalence signatures of non-CH₄ hydrocar- 
bons (3, 6–8) can be used to attribute emis- 
sions to fossil sources rather than biogenic 
sources. Evidence from regional studies sug- 
gests that CH₄ emissions with fossil signa- 
tures are larger than expected (3, 6, 7, 9, 11), 
whereas national-scale evidence suggests a 
mix of biogenic and fossil sources (6). Atmo- 
spheric studies that control for biogenic CH₄ 
sources (1, 2, 7) are dependent on biogenic 
source estimation methods that also have 
high uncertainties (6). Natural geologic seeps 
could confound attribution (see the second 
chart and SM). 

Studies can attribute emissions to liquid 
petroleum and NG sources rather than coal 
by sampling in places with little coal-sector 
activity (2, 3, 6, 7, 9). Attributing leakage 
to the NG system, as defined by EPA indus- 
try sector classifications, is more challeng- 
ing. Alkane fingerprints may allow attribu- 
tion to oil-associated NG (9), although NG 
processing changes gas composition, which 
may complicate efforts to pinpoint leakage 
sources. Geographic colocation of facilities 
and sampling, along with geographically 
isolating wind directions (2, 3, 7), can allow 
attribution of emissions to NG subsectors. 
Without spatial isolation, sector attribution 
can require assumptions about gas composi- 
tion that introduce significant uncertainty (2, 
3, 25). 

We plotted results of a thought experiment
(see the second chart) in which we estimated 
emissions ranges of selected possible sources 
within the NG sector, as well as sources that 
could be mistaken for NG emissions owing to 
chemical and isotopic signatures. Although 
such an analysis is speculative given current 
knowledge, it illustrates ranges of possible 
source magnitudes.

We include in the second chart a range
of excess CH4 from all sources (7 to 21 ×
10^12 g or Tg/year) based on normalized
national-scale atmospheric studies from the
inset in the first chart. This excess is conservatively
defined as 1.25 to 1.75 times EPA
GHGI estimates. This estimate is derived
from national-scale atmospheric studies and
includes all sources of CH4 emissions: It
should not be expected that NG sources are
responsible for all excess CH4.
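
The two ends of the stated range can be cross-checked with simple arithmetic. The sketch below assumes that "excess" means the amount above the inventory, that is, excess = (multiplier - 1) × inventory. The implied inventory total is an inference from the numbers quoted here, not a figure given in the text, and the function name is hypothetical.

```python
def implied_inventory_tg(excess_tg: float, multiplier: float) -> float:
    """Inventory total (Tg/year) implied by an excess and a multiplier,
    ASSUMING excess = (multiplier - 1) * inventory."""
    if multiplier <= 1.0:
        raise ValueError("multiplier must exceed 1 for a positive excess")
    return excess_tg / (multiplier - 1.0)


if __name__ == "__main__":
    low = implied_inventory_tg(7.0, 1.25)    # lower bound of the range
    high = implied_inventory_tg(21.0, 1.75)  # upper bound of the range
    print(f"implied inventory: {low} and {high} Tg/year")
```

Both bounds imply the same inventory total, which suggests the range was built by applying the two multipliers to a single baseline.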

The scenarios in the second chart for
NG production and/or processing, distribution,
and petroleum system emissions apply
observed leakage rates from the literature
that are higher than EPA GHGI estimates (1,
2, 7). The frequency of such high-emitting
practices is unknown, so illustrative prevalence
scenarios are plotted: 1, 10, or 25%
of activity is represented by high-emitters;
the remaining facilities emit at EPA GHGI
rates. This evidence suggests that high leakage
rates found in recent studies (1, 2, 7) are
unlikely to be representative of the entire
NG industry; if this were the case, associated
emissions would exceed observed total
excess atmospheric CH4 from all sources.
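
The prevalence scenarios amount to a weighted average of two leakage rates. The following is a minimal sketch of that calculation. The throughput and both leakage rates in the usage example are hypothetical placeholders, not values from the chart or from EPA, and the function names are invented for illustration.

```python
def blended_leakage_rate(high_fraction: float,
                         high_rate: float,
                         baseline_rate: float) -> float:
    """Average leakage rate when `high_fraction` of activity leaks at
    `high_rate` and the remainder leaks at `baseline_rate`."""
    if not 0.0 <= high_fraction <= 1.0:
        raise ValueError("high_fraction must lie in [0, 1]")
    return high_fraction * high_rate + (1.0 - high_fraction) * baseline_rate


def scenario_emissions(throughput: float,
                       high_rate: float,
                       baseline_rate: float,
                       fractions=(0.01, 0.10, 0.25)) -> dict:
    """Emissions for each prevalence scenario (1, 10, or 25% by default)."""
    return {f: throughput * blended_leakage_rate(f, high_rate, baseline_rate)
            for f in fractions}


if __name__ == "__main__":
    # Hypothetical inputs: 400 Tg/year throughput, 8% high-emitter rate,
    # 1.5% baseline rate. None of these numbers come from the article.
    for frac, tg in scenario_emissions(400.0, 0.08, 0.015).items():
        print(f"{frac:.0%} high-emitters -> {tg:.1f} Tg/year")
```

Setting `high_fraction` to 1 reproduces the case the text rules out, in which every facility leaks at the high rate.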

In general, the wide ranges in the second
chart suggest a poor understanding of
sources of excess CH4 and point to areas
where improved science would reduce
uncertainty. However, hydraulic fracturing
for NG is unlikely to be a dominant contributor
to total emissions (26). Also, some
sources not included in the GHGI may contribute
to measured excess CH4, e.g., abandoned
oil and gas wells and geologic seeps
(see SM).


Policy Challenges and Opportunities 
Leakage scenarios in the second chart 
have implications for decision-making 
and policy. A key tool for environmental 
decision-making is life-cycle assessment 
(LCA), which compares impacts associated 
with varying methods of supplying a use- 
ful product (e.g., kWh of electricity). A key 
challenge in LCA studies is attribution of 
emissions from systems that produce two 
products, such as “gas” wells that also pro- 
duce hydrocarbon liquids, or “oil” wells that 
also produce NG. This challenge is compli- 
cated by incongruence between LCA meth- 
odology and EPA sector definitions (see SM). 
Recent LCAs have estimated GHG emis- 
sions from NG use in power generation and 
transport (see SM). LCA studies generally 
agree that replacing coal with NG has cli- 
mate benefits (27). However, LCAs have 
relied heavily on EPA GHGI results. Updating
these assessments with uncertainty
ranges from the second chart (see SM) still
supports robust climate benefits from NG 
substitution for coal in the power sector 
over the typical 100-year assessment period. 
However, climate benefits from vehicle fuel 
substitution are uncertain (gasoline, light- 
duty) or improbable (diesel, heavy-duty) 
(28). These conclusions may undercount 
benefits of NG, as both EPA GHGI methods 
and many regionally focused top-down studies
attribute CH4 emissions from coproducing
NG systems to the NG sector, rather than
to a mixture of oil and NG sources. 

How can management and policy help 
address the leakage problem? Opportunities 
abound: Many solutions are economically 
profitable at moderate NG prices, with some 
technologies already being adopted or to be 
required in regulation (23, 26) (e.g., reduced 
emissions completions). Facility studies 
using existing technology have found leak- 
age detection and repair programs to be 
profitable (21).

The heavy-tailed distribution of observed 
emissions rates presents an opportunity for 
large mitigation benefits if scientists and 
engineers can develop reliable (possibly 
remote) methods to rapidly identify and fix 
the small fraction of high-emitting sources. 

However, this heterogeneity also creates 
challenges in formulating statistical distri- 
butions for use in inventories. Approaches 
that assume “typical” emissions rates for 
this industry are inherently challenged. 
Inventories can be improved through efforts 
to better characterize distributions and by 
incorporating flexibility to adapt to new 
knowledge. 

Improved science would aid in generat- 
ing cost-effective policy responses. Given 
the cost of direct measurements, emis- 
sions inventories will remain useful for 
tracking trends, highlighting sources with 
large potential for reductions, and making 
policy decisions. However, improved 
inventory validation is crucial to ensure 
that supplied information is timely and 
accurate. Device-level measurements can 
be performed at facilities of a variety of 
designs, vintages, and management practices 
to find low-cost mitigation options. These 
studies must be paired with additional atmo- 
spheric science to close the gap between top- 
down and bottom-up studies. One such large 
study is under way (29), but more work is 
required. 

If natural gas is to be a “bridge” to a more
sustainable energy future, it is a bridge that 
must be traversed carefully: Diligence will 
be required to ensure that leakage rates are 
low enough to achieve sustainability goals.

DEVELOPMENTAL BIOLOGY


Self-Organizing Somites 


Shigeru Kondo 


During vertebrate embryogenesis, the body plan is
established through the organization of blocks of tissue
(somites) that form along either 
side of the neural tube, the precur- 
sor to the adult spinal cord (see the 
figure). How a regularly arranged 
pattern of somites arises from the 
patternless presomitic mesoderm 
has not been clear, but the “clock 
and wavefront” model of oscillat- 
ing gene expression (the clock) 
and a traveling wave of signals to 
stop the oscillation, put forth in 
1976 (1), has been the most widely 
accepted theory. On page 791 of 
this issue, Dias et al. (2) try to over- 
turn this model by showing that 
somites can form without either 
oscillations or the wave. 

A somite is a spherical ball 
with a lumen surrounding a vari- 
able amount of mesodermal cells 
(depending on the species); the 
mesodermal cells lie beneath a 
surface layer of epithelial cells 
(also of a variable thickness). 
According to the clock and wave- 
front model and the experiments supporting 
it (1, 3-5), what determines the size of each
somite is the combination of local oscilla- 
tion in the expression of a network of genes 
in presomitic mesodermal cells and their 
exposure to a wave of extracellular signal(s) 
that travels from anterior to posterior in the 
presomitic mesoderm. The oscillation per- 
sists over the period of time required for a 
somite to form, which varies across species 
(for example, it is 30 min for zebrafish but 
90 min for the chick). It stops when a cell 
meets a wavefront of a specific signal [such 
as the reduction of fibroblast growth factor 
(FGF)] that triggers mesodermal cells to 
transition and become epithelial cells. The 
somite boundary is thus defined by those 
cells in which the oscillation has stopped 
at a specific phase. As the wavefront con- 
tinuously travels from anterior to posterior, 
somite boundaries are made sequentially
with even spacing. The somite size is determined
by the product of the oscillation period
and the speed of the traveling wave.
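
In the standard formulation of the clock and wavefront model, one segment is laid down per clock cycle, so its length equals the distance the wavefront travels during one oscillation period. The sketch below illustrates that relationship. The 90-min and 30-min periods are the chick and zebrafish values quoted above, whereas the wavefront speed is a hypothetical placeholder and the function name is invented.

```python
def somite_length_um(wavefront_speed_um_per_min: float,
                     clock_period_min: float) -> float:
    """Segment length = distance the wavefront advances in one clock period."""
    if wavefront_speed_um_per_min < 0 or clock_period_min < 0:
        raise ValueError("speed and period must be non-negative")
    return wavefront_speed_um_per_min * clock_period_min


if __name__ == "__main__":
    speed = 1.5  # um/min, HYPOTHETICAL wavefront speed for illustration
    for species, period in (("chick", 90.0), ("zebrafish", 30.0)):
        length = somite_length_um(speed, period)
        print(f"{species}: period {period} min -> {length} um per somite")
```

At a fixed wavefront speed, shortening the period shortens the segment in proportion, which matches the mouse and zebrafish observations discussed later in the article.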

Dias et al. observed something different. 
The authors found that presomitic mesoder- 
mal cells removed from an early-stage chick 
(or quail) embryo, incubated with the mole- 
cule Noggin (for 3 hours) and then implanted 
into the extraembryonic region, could simul- 
taneously form up to 14 somite-like struc- 
tures that collectively resemble a “bunch of 
grapes.” Noggin restricts the differentiation 
of mesodermal cells into somitic cells in the 
developing chick. Each somite-like struc- 
ture consisted of epithelial cells surround- 
ing a lumen and was of a size consistent with 
somites found in the chick embryo. They only 
lacked a rostral-caudal subdivision. When the 
somite-like structures were implanted in the 
position of normal somites in a host chick 
embryo, they developed the tissues that are 
normally derived from somite cells. Oscil- 
lation of segmentation gene expression was 
not observed; because the structures were 
implanted outside of the presomitic mesoderm,
they were not exposed to a wavefront
of signal that stops the oscillatory clock.

The size and position of tissue during
vertebrate body plan development may rely
on the cooperation of mechanisms that act
globally and locally.

Somitogenesis. (A) In the vertebrate embryo, blocks of tissue called somites form along the body axis. (B) Somites arise
from presomitic mesoderm. Local cell-autonomous mechanisms may determine the size of a somite, whereas a global
mechanism such as the “clock and wavefront” may determine the absolute position of each somite along the body axis.
The clock and wavefront model proposes that oscillations of gene expression and waves of extracellular signal control the
regularly arranged array of somites.

This finding of Dias et al. indicates that 
presomitic mesodermal cells can self-orga- 
nize the somite structure without the indica- 
tion of the segment position by the clock and 
wavefront mechanism. Rather, local interactions
between cells appear to control the morphology
of a somite, including its correct size.
It remains to be examined whether the 
presomitic mesoderm of mice or zebrafish 
has the same property. But the experiment 
of Dias et al. is quite simple and the result 
is dramatic, suggesting that the phenome- 
non could be universal among vertebrates. In 
the mouse and zebrafish (6, 7), the anterior- 
to-posterior length of the somites decreases 
when the period of segment gene oscillation 
is shortened. This points to direct involve- 
ment of the oscillation period in somite size 
determination. However, the finding of Dias 
et al. does not necessarily contradict coopera- 
tion between global (clock and wavefront) and 
local (cell-cell interaction) mechanisms to 
control somitogenesis. The size of a structure
that is made by a cell-autonomous mechanism
is usually influenced considerably by local 
environmental perturbations. Because the size 
change of somites observed in the mouse and 
zebrafish studies was not so drastic and was 
within the flexibility of a local autonomous 
mechanism, the existence of a local autono- 
mous mechanism cannot be ruled out. If the 
size change was more extensive (greater than 
200% or less than 50%), then the possibility 
of a local mechanism would be quite low. 
The cooperation of the two different 
mechanisms offers many advantages. A local 
cell-autonomous mechanism can determine 
the size of a somite but cannot determine the 
absolute position of each somite. To form the 
regularly arranged array of somites, some 
global mechanism like the clock and wave- 
front mechanism is surely required. The clock 
and wavefront model does have some weak
points, too. For example, it does not work for
the most cranial four or five somites because 
they arise simultaneously (8). Another prob- 
lem concerns the precision of the positional 
information given by the wavefront. Tempo- 
rally, the concentration gradient of FGF is 
thought to act as the wavefront (in the chick, 
mouse, and zebrafish) (5). However, the slope 
of the gradient seems too gentle to indicate 
the precise timing to the oscillating cells. The 
model also cannot specify somite size along 
either the dorsoventral or the lateral axis. By 
incorporating a local autonomous mecha- 
nism to determine the somite size, these 
weaknesses are removed. 

Although the cooperation of a global and 
local mechanism is possible, it leaves the 
most important question unresolved: What 
determines the size of somites? Dias et al. 
present a mathematical model mainly based
on packing constraints of cells transitioning
between a mesenchymal state and a polarized 
epithelium. But other processes transferring 
the long-range signal, such as diffusion, cell 
projection, and mechanical stress, could be 
mechanisms that determine the regular size.

CLIMATE CHANGE


A Drier Future? 


Steven Sherwood and Qiang Fu


Global temperature increases affect the
water cycle over land, but the nature
of these changes remains difficult to
predict. A key conceptual problem is to dis- 
tinguish between droughts, which are tran- 
sient regional extreme phenomena typically 
defined as departures from a local climato- 
logical norm that is presumed known, and 
the normal or background dryness itself. This 
background dryness depends on precipitation, 
but also on how fast water would evaporate. 
As the planet warms, global average rainfall 
increases, but so does evaporation. What is the 
likely net impact on average aridity? 

Most studies of dryness focus on droughts 
rather than on the background aridity or 
changes thereto. They tend to rely on rela- 
tively simple measures that are useful for 
analyzing temporary anomalies but may not 
properly account for factors that govern the 
background state. Failure to explicitly account 
for changes in available energy, air humidity, 
and wind speed can cause some indices com- 
monly used for identifying droughts to diag- 
nose an artificial trend toward more drought 
in a warming climate (1). Recognition of this
problem has undone past claims that
drought is on the rise globally, and 
led to weaker claims about observed 
drought trends in the most recent 
Intergovernmental Panel on Cli- 
mate Change report (2). However, 
that does not mean that conditions 
will not get drier (3, 4). 

A different way of approaching 
the problem is to try to capture the 
changes in background state, rather 
than temporary anomalies such as 
droughts. This can, for example, 
be done using the ratio of precipi- 
tation (P) to potential evapotranspiration 
(PET) based on the Penman-Monteith equation
(1, 5). PET is the evaporative demand
of the atmosphere, calculated as the amount 
of evaporation one would get, with given air 
properties, from a completely wet surface. 
Over a body of water PET equals the true 
evaporation, but on land, the true evaporation 
will be less than PET unless the soil is satu- 
rated with water. The P/PET ratio may be near 
zero in a desert but can exceed unity in wet 
climates. If the P/PET ratio falls, it means that 
conditions get drier; if it rises, conditions are 
getting wetter. 

Global warming is likely to lead to overall
drying of land surfaces.

Recent observational studies have shown
that P/PET is decreasing on average as the
globe warms (5, 6). Climate model simulations
(see fig. S1, panel A) (5) predict that
by 2100 under a high-emissions
scenario, when climate is
projected to be several degrees 
warmer than it is now, P/PET 
will drop much further in most 
tropical and mid-latitude land 
regions (see the figure). Such 
drops can shift a region to the 
next drier climate category 
among humid, subhumid, semi- 
arid, arid, and hyperarid condi- 
tions (the latter four together are 
denoted dryland). In one simula- 
tion, the area of global dryland 
is projected to expand by ~10% by 2100 (5). 
Models predict that India and northern tropi- 
cal Africa will become wetter, but nearly all 
other land regions are predicted to become 
drier. Under most scenarios, the drying would 
further intensify during the 22nd century. 
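
The category shifts described above can be sketched as a lookup on the P/PET ratio. The article names the five categories but gives no cutoffs, so the thresholds below are an assumption: they follow the widely used UNEP aridity-index convention. The function names and the sample ratios are hypothetical.

```python
# ASSUMED thresholds (UNEP aridity-index convention), not from the article.
ARIDITY_CLASSES = (
    (0.05, "hyperarid"),
    (0.20, "arid"),
    (0.50, "semiarid"),
    (0.65, "subhumid"),
)


def aridity_class(p_over_pet: float) -> str:
    """Map a P/PET ratio to a climate category; lower ratios are drier."""
    if p_over_pet < 0:
        raise ValueError("P/PET cannot be negative")
    for upper_bound, name in ARIDITY_CLASSES:
        if p_over_pet < upper_bound:
            return name
    return "humid"


def is_dryland(p_over_pet: float) -> bool:
    """Dryland = any of the four non-humid categories."""
    return aridity_class(p_over_pet) != "humid"


if __name__ == "__main__":
    before, after = 0.68, 0.62  # hypothetical ratios for one region
    print(aridity_class(before), "->", aridity_class(after))
```

A modest drop in the ratio near a threshold is enough to move a region from humid into dryland, which is how a small mean change can translate into a sizable expansion of dryland area.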
Global averages of precipitation and evap- 
oration must remain equal to each other on 
climate time scales. The observed and pre- 
dicted drying tendency in P/PET over land 
thus implies that PET there increases faster 
than does global evaporation (noting that 
precipitation changes similarly on land and 
oceans). If there were no land on Earth, PET 
globally could not increase faster than P; they 
would always be equal. Thus, the decrease
in P/PET must be peculiar to land surfaces.
One might expect complex land-surface
responses involving soils or vegetation to
be responsible, but recent research (7-12)
suggests that the overall drying trend on land
is rooted in relatively simple atmospheric
thermodynamics.

Warmer, drier. Aridity increases in warmer climates, leading to expansion of dry climate zones. Evaporation
and precipitation increase modestly, but on land, evaporative demand (broken wavy arrows) increases faster
than precipitation, because the strong increases in air temperature and consequently saturated water vapor
concentration over land (red bars at lower right) exceed growth in actual water vapor concentration (blue
bars). Increases in sensible and latent heat (associated, respectively, with temperature and water vapor, and
represented by the area of each bar) have the same sum over land and ocean, with sensible heat increasing
more over land than oceans and latent heat increasing more over oceans. Relative humidity (ratio of blue to
red bar length) decreases over land.

The key factor causing drying is that land 
surfaces (and the air just above them) warm, 
on average, about 50% more than ocean 
surfaces (7). There is a simple and plausi- 
ble explanation for this long-remarked phe- 
nomenon, at least for low and mid-latitudes. 
The atmosphere keeps convective instability 
(which gives rise to cumulus clouds) small 
over both land and ocean regions. This insta- 
bility depends on the total latent and sensi- 
ble heat in air near the surface. Because the 
latent heat (determined by atmospheric water 
vapor concentration) is smaller over land and 
changes less upon warming (see the figure),
sensible heat (determined by air temperature)
must change more, explaining the enhanced 
land warming (7, 8). Indeed, if this enhanced 
warming did not happen, air over land would 
become less able to sustain clouds and pre- 
cipitation, thus drying and warming the land 
via increased sunshine. Enhanced warming 
of land surfaces relative to oceans thus occurs 
simply because continental air masses are 
drier than maritime ones, which in turn is a 
consequence of the limited availability of sur- 
face water. 

The second factor ensuring drying is 
that water vapor content over land does not 
increase fast enough relative to the rapid 
warming there. Nearly all water vapor in the 
atmosphere comes from the oceans, where 
the water vapor content of the overlying 
air increases by ~6% per degree Celsius of
ocean surface temperature (9, 10). When
this air moves onto land, its typical water
vapor content (though reduced) reflects the
amount that it held originally (11). Because
the land warms faster than the oceans, however,
the humidity of the arriving air does not
increase enough to maintain a constant relative
humidity. The latter must therefore fall
on average (see the figure), as indeed seen
in model simulations (11, 12) and observed
on all continents (10). Therefore, the saturation
deficit (gap between actual and saturation
ration water vapor concentration; see the 
figure), which is the key factor controlling 
PET, grows much faster in percentage terms 
than do other hydrological quantities. This 
increases the aridity. 
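
The argument of the last two paragraphs can be illustrated numerically. The sketch below takes the ~6% per degree vapor increase and the 50% land amplification from the text. The Magnus approximation for saturation vapor pressure, the starting temperature, and the starting humidity are assumptions added for illustration, and the function names are hypothetical.

```python
import math


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Magnus approximation over liquid water (an assumed formula;
    the article does not specify one)."""
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def land_humidity_response(land_temp_c: float,
                           rel_humidity: float,
                           ocean_warming_c: float,
                           land_amplification: float = 1.5,
                           vapor_increase_per_c: float = 0.06) -> dict:
    """Relative humidity and saturation deficit over land before and after
    warming, assuming vapor content tracks ocean warming (~6% per deg C)
    while land warms `land_amplification` times as much as the ocean."""
    es_before = saturation_vapor_pressure_hpa(land_temp_c)
    e_before = rel_humidity * es_before
    es_after = saturation_vapor_pressure_hpa(
        land_temp_c + land_amplification * ocean_warming_c)
    e_after = e_before * (1.0 + vapor_increase_per_c) ** ocean_warming_c
    return {
        "rh_before": rel_humidity,
        "rh_after": e_after / es_after,
        "vapor_growth": e_after / e_before,
        "deficit_growth": (es_after - e_after) / (es_before - e_before),
    }


if __name__ == "__main__":
    # Hypothetical starting state: 25 deg C, 50% relative humidity,
    # with 2 deg C of ocean warming.
    for key, value in land_humidity_response(25.0, 0.5, 2.0).items():
        print(f"{key}: {value:.3f}")
```

Under these assumptions the relative humidity falls and the saturation deficit grows by a larger factor than the vapor content itself, in line with the reasoning above.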

A map of the predicted change in annual 
mean near-surface relative humidity (see fig. 
S1, panel B) (13) not only confirms a general
decrease over most land regions, but
also shows a pattern nearly identical to that 
of the change in P/PET. These similarities 
show that regional changes in near-surface 
humidity, soil moisture, and precipitation are 
tightly coupled. Increases in PET are mainly 
attributable to overall land warming rather 
than relative humidity change (14), but the
P/PET ratio on land is reduced largely by the
enhanced land warming relative to oceans 
(see the figure) and by the decreases in rela- 
tive humidity on land. The latter are negative 
over most land areas despite being slightly 
positive over oceans. Positive feedback from 
soil moisture changes is not needed to explain 
enhanced land warming, but likely amplifies 
it in some regions (15).

Regional variations in simulated aridity 
change may still be unreliable, or may reflect 
other changes such as poleward shifts of cli- 
mate zones (5). But the general trend toward 
a drier land surface appears to rest on rela- 
tively firm foundations. The predicted dry- 
ing would be sufficient to shift large portions 
of the Earth to new, drier climate categories 
(although the richer atmospheric CO2 might
mitigate the impact on some plants). The 
background drying is separate from, but may 
be compounded by, the expected trend toward 
more intermittent rainfall for a given mean 
rain rate (16).

As the above considerations show, focus- 
ing on changes in precipitation, as typical in 
high-profile climate reports (2), does not tell 
the whole story—or perhaps even the main 
story—of hydrological change. In particu- 
lar, it obscures the fact that in a warmer cli- 
mate, more rain is needed. Many regions will 
get more rain, but it appears that few will get 
enough to keep pace with the growing evaporative
demand.

CHEMISTRY


Capturing Surface Processes


Chris Nicklin


The outer atomic layers of a solid or
liquid play a central role in determining
the properties of the sample as
a whole, because it is here that the material
interacts with the external environment.
Detailed knowledge of the arrangement of
atoms at a surface or interface between two
materials is required to understand and tune
the material’s properties. This outer-layer
structure is crucial for technological processes
such as catalysis, lubrication, and
electron transport. In surface x-ray diffraction,
surface structures are investigated by
directing high-energy x-rays at a sample at
grazing angles of typically less than 1° (1).
On page 758 of this issue, Gustafson et al.
outline a different geometry for these measurements,
using even higher-energy x-rays
and shallower angles to allow faster data collection,
enabling dynamic surface restructuring
processes to be captured (2).

In surface x-ray diffraction, the diffracted
intensity results from a combination
of x-rays scattered from the bulk of the
sample and x-rays scattered from its surface
(see the figure). Intense Bragg peaks
occur where the bulk scattering exhibits
constructive interference. The truncation
of the sample at the surface leads to streaking
between the Bragg peaks in the direction
perpendicular to the surface. These
streaks, known as crystal truncation rods
(CTRs) (3), show modulations in intensity
that result from interference between the
bulk-scattered and surface-scattered x-rays.
Additionally, ordered reconstructions of
the outer atomic layers result in superstructure
rods, which have an intensity profile
that depends only on the surface scattering.
Modeling these modulations can reveal the
surface structure and registry with the bulk
with a resolution of <0.05 Å. The ordered
array of diffraction features (CTRs, superstructure
rods, and Bragg peaks) formed
by a single-crystal sample is known as the
reciprocal space lattice.

Traditionally, surface x-ray diffraction
measurements have required a high-resolution
diffractometer, which allows the sample
and detector to be accurately positioned at
specific angles relative to each other. These
instruments usually have five or six independent
rotation axes that enable a particular
diffraction feature to be detected while
maintaining the fixed angle of incidence (4).
In most state-of-the-art experiments, the scattered
x-rays are collected by a small two-dimensional
detector (5), where the intensity
of the spot or streak on the detector
results in a single point on the rod.

A modified surface x-ray diffraction geometry
allows dynamic restructuring of surfaces to be
studied.

A powerful geometry. In the conventional
geometry of surface x-ray diffraction (A),
the Ewald sphere intersects a short section
of the CTR, allowing only a small amount of
data to be collected at one time. Gustafson
et al.'s modified geometry (B) allows more
diffraction features to be observed on a
static large-area detector.

One way to understand the x-ray scattering
process is through the Ewald construction
(6), which leads to a representation of all
the possible points where diffraction could 
occur, known as the Ewald sphere. However, 
it is only in the small area where the Ewald 
sphere intersects the reciprocal space lattice, 
and with the detector placed appropriately, 
that a diffracted signal from the sample is 
actually measured (see the figure, panel A). 
Measuring the full CTR requires the coor- 
dinated movement of three or four of the 
diffractometer axes (detector and sample), 
which is relatively slow. 

The geometry outlined by Gustafson 
et al. enables a much larger section of the 
CTR to be collected at once (see the figure, 
panel B). The higher x-ray energy increases 
the size of the Ewald sphere, meaning that 
it is flatter when it cuts through the CTR. 
This greater overlap explains why more of 
the rod is visible on the detector. A full CTR 
is recorded by rotating the sample perpen- 
dicular to its surface through a small angu- 
lar range, which moves the intersection 
point of the Ewald sphere and the rod. The 
higher energies also lead to the reciprocal
space lattice being much more contracted in
real space, reducing the angular separation
between the scattered diffraction features. A
suitably large detector can therefore collect 
many reflections simultaneously. 
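
The effect of photon energy on the geometry can be sketched with two textbook relations: the wavelength λ = hc/E and Bragg's law, sin θ = λ/(2d). The Ewald sphere radius is taken as 2π/λ, which is one common convention. The two photon energies and the 2 Å lattice spacing in the usage example are hypothetical, because the article quotes no specific values, and the function names are invented.

```python
import math

HC_KEV_ANGSTROM = 12.398  # h*c in keV * angstrom (rounded)


def wavelength_angstrom(energy_kev: float) -> float:
    """X-ray wavelength for a given photon energy."""
    return HC_KEV_ANGSTROM / energy_kev


def ewald_radius_inv_angstrom(energy_kev: float) -> float:
    """Ewald sphere radius k = 2*pi/lambda (convention assumed here)."""
    return 2.0 * math.pi / wavelength_angstrom(energy_kev)


def bragg_angle_deg(energy_kev: float, d_spacing_angstrom: float) -> float:
    """First-order Bragg angle theta from sin(theta) = lambda / (2 d)."""
    s = wavelength_angstrom(energy_kev) / (2.0 * d_spacing_angstrom)
    if s > 1.0:
        raise ValueError("no Bragg reflection: wavelength exceeds 2d")
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    d = 2.0  # angstrom, HYPOTHETICAL lattice spacing
    for energy in (20.0, 85.0):  # keV, HYPOTHETICAL beam energies
        print(f"{energy} keV: radius {ewald_radius_inv_angstrom(energy):.1f} "
              f"1/A, Bragg angle {bragg_angle_deg(energy, d):.2f} deg")
```

Raising the energy enlarges the sphere, so its surface is flatter where it crosses a rod, and shrinks the scattering angles, so more reflections fit on a fixed detector.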

This geometry has a number of benefits, 
as shown in Gustafson et al.’s elegant study.
A large data set can be acquired quickly, 
allowing dynamic restructuring of the sur- 
face to be monitored in detail during in 
situ processing. As an example, the authors 
determine the changes to the Pd surface dur- 
ing catalytic oxidation of carbon monoxide 
with subsecond time resolution. Addition- 
ally, the impressive volume of data collected 
can be visualized in a number of ways. The 
authors, for example, show an “in-plane” 
projection. Due to the inherent high reso- 
lution of surface x-ray diffraction, they can 
clearly observe the small shifts in peak posi- 
tions that result from strain in the recon- 
structed sample. 

The novel experimental geometry 
reported by Gustafson et al. makes it possible
to extend surface x-ray diffraction to experi- 
ments that are currently very difficult, if not 
impossible. In contrast to many other surface
structural techniques, surface x-ray diffraction
is not restricted to a vacuum environment.
Ambient pressure studies, for example
(7), better resemble real conditions in auto- 
motive catalysts or in atmospheric reactions. 
The technique could be used to characterize 
transient structural phases that may occur in a
specific pressure or humidity range. It could 
also provide deeper insights into dynamic 
processes such as the mechanism of facet for- 
mation during gas exposure, by monitoring 
not only the intensity, but also the angle and 
splitting of the CTRs in three dimensions. 
The ability to visualize the data in simple- 
to-interpret ways adds to the benefits of this 
powerful surface structural technique.

Graphene Oxide Membranes
for Ionic and Molecular Sieving


Baoxia Mi 


Ionic and molecular sieving membranes
that enable fast solute separations
from aqueous solutions are essential
for processes such as water purification
and desalination, sensing, and energy production
(1-3). The two-dimensional structure
and tunable physicochemical properties
of graphene oxide (GO) offer an exciting
opportunity to make a fundamentally
new class of sieving membranes by stacking
GO nanosheets (4-6). In the layered
GO membrane, water molecules permeate
through the interconnected nanochannels
formed between GO nanosheets and follow
a tortuous path primarily over the hydrophobic
nonoxidized surface rather than
the hydrophilic oxidized region of GO (7).
The nearly frictionless surface of the nonoxidized
GO facilitates the extremely fast
flow of water molecules (5). On page 752
of this issue, Joshi et al. (8) further report
that ions smaller in size than the GO nano- 
channel can permeate in the GO membrane 
at a speed orders of magnitude faster than 
would occur through simple diffusion. Size 
exclusion appears to be the dominant siev- 
ing mechanism. 

When dry, GO membranes made by vac- 
uum filtration can be so tightly packed (with 
a void spacing of ~0.3 nm between GO 
nanosheets) that only water vapor aligned in 
a monolayer can permeate through the nano- 
channel (5). Joshi et al. found that when 
such a GO membrane was immersed in an 
ionic solution, hydration increased the GO 
spacing to ~0.9 nm (8). Any ion or molecule 
with a hydrated radius of 0.45 nm or less 
could enter the nanochannel, but all larger- 
sized species were blocked (see the figure). 

Such a sharp size cutoff by the GO mem- 
brane has important implications in a myriad 
of separation applications. By adjusting the 
GO spacing through sandwiching appropri- 
ately sized spacers between GO nanosheets, 


Membranes made by properly spacing and 
bonding stacked graphene oxide nanosheets 
enable precise, superfast sieving of ions and 
molecules. 


a broad spectrum of GO membranes could 
be made, each being able to precisely sepa- 
rate target ions and molecules within a spe- 
cific size range from bulk solution. Com- 
pared with the typically wide pore-size 
distribution of commonly used polymeric 
membranes, the narrow channel-size distri- 
bution of GO membranes is truly advanta- 
geous for precise sieving. 

The hydration of GO in aqueous solu- 
tion, however, makes it more challeng- 
ing to manipulate the GO spacing within a 
subnanometer range than to enlarge it. For 
example, desalination requires that the GO 
spacing should be less than 0.7 nm to sieve 
the hydrated Na+ (with a hydrated radius of
0.36 nm) from water. Such small spacing 
could be obtained by partially reducing GO 
to decrease the size of hydrated functional 
groups or by covalently bonding the stacked 
GO nanosheets with small-sized molecules 
to overcome the hydration force. 
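
The size-exclusion picture described above can be made concrete with a small sketch. It assumes, as a simplification that is consistent with the two numerical examples in the text (a 0.9-nm spacing admitting species with a hydrated radius of up to 0.45 nm, and a spacing below 0.7 nm rejecting hydrated Na+ at 0.36 nm), that a species passes when its hydrated diameter does not exceed the GO spacing. The function name, the tolerance, and the 0.69-nm example spacing are illustrative assumptions, not values or methods taken from Joshi et al.

```python
# Toy size-exclusion rule (an assumption for illustration):
# a species permeates if its hydrated diameter fits within the GO spacing.

TOLERANCE_NM = 1e-9  # guards against floating-point noise at the exact cutoff


def permeates(hydrated_radius_nm: float, spacing_nm: float) -> bool:
    """Return True if a species of the given hydrated radius fits the channel."""
    return 2.0 * hydrated_radius_nm <= spacing_nm + TOLERANCE_NM


HYDRATED_SPACING_NM = 0.9      # spacing of a hydrated GO membrane, from the text
NA_HYDRATED_RADIUS_NM = 0.36   # hydrated radius of Na+, from the text
NARROW_SPACING_NM = 0.69       # hypothetical spacing just under the 0.7-nm limit

for label, spacing in (("hydrated GO", HYDRATED_SPACING_NM),
                       ("narrowed GO", NARROW_SPACING_NM)):
    passes = permeates(NA_HYDRATED_RADIUS_NM, spacing)
    print(f"{label} ({spacing} nm): Na+ {'passes' if passes else 'is blocked'}")
```

Under this rule, Na+ passes through the hydrated 0.9-nm channel but is blocked once the spacing is narrowed below 0.7 nm, which is why desalination requires the tighter spacing. Real membranes may also sieve by charge or adsorption, which this sketch ignores.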

In contrast, an enlarged GO spacing (1
to 2 nm) can be conveniently achieved by
inserting large, rigid chemical groups (6) or
soft polymer chains (e.g., polyelectrolytes)
between GO nanosheets, resulting in GO
membranes ideal for applications in water
purification, wastewater reuse, and pharmaceutical
and fuel separation. If even larger-sized
nanoparticles or nanofibers are used
as spacers, GO membranes with more than
2-nm spacing may be produced for possible
use in biomedical applications (e.g., artificial
kidneys and dialysis) that require precise
separation of large biomolecules and
small waste molecules.

GO membranes. (A) Water and small-sized ions and molecules (compared with
the void spacing between stacked GO nanosheets) permeate superfast in the GO
membrane, but larger species are blocked. (B) The separation capability of the
GO membrane is tunable by adjusting the nanochannel size. (C) Several methods
for the synthesis of GO membranes have been reported or are envisioned;
GO nanosheets can be physically packed by vacuum filtration (options 1 to 3), or
they can be stabilized by covalent bonds, electrostatic forces, or both (options 4
to 6) during layer-by-layer assembly.

GO membranes can be synthesized either 
by vacuum filtration or by layer-by-layer 
(LbL) assembly, both of which are conducted 
in aqueous solution without any organic sol- 
vent involved and, hence, are more envi- 
ronmentally friendly. The GO membranes 
prepared by vacuum filtration, either from
a pure GO solution or a mixture of GO 
and spacers, might lack sufficient bonding 
between GO nanosheets. Because of the high 
hydrophilicity of GO, these membranes are 
likely to disperse in water, especially under 
cross-flow conditions typically encountered 
in membrane operations. In contrast, the LbL 
method is ideal for introducing an interlayer 
stabilizing force via covalent bonding (6), 
electrostatic interaction, or both effects dur- 
ing layer deposition. 

The GO membrane thickness can be 
readily controlled by varying the number of 
LbL deposition cycles. Theoretically, as few 
as two stacked GO layers would be needed 
to create a sieving channel. In reality, how- 
ever, deposition of additional GO layers 
is warranted to counteract the detrimental 
effects of possible defects and nonuniform
deposition of GO nanosheets on the mem- 
brane’s sieving capability. Finally, the LbL 
synthesis of GO membranes is highly scal- 
able and cost-effective, unlike the challeng- 
ing synthesis of monolayer graphene mem- 
branes, which requires the manufactur- 
ing of large-sized graphene sheets and the 
punching of nanopores with a narrow size 
distribution (9). 

Indeed, the GO membrane represents 
a next generation of ultrathin, high-flux, 
and energy-efficient membranes for pre- 
cise ionic and molecular sieving in aque- 
ous solution, with applications in numer- 
ous important fields. Future research is 
needed to understand thoroughly the trans- 
port of water and solutes in the GO mem- 
brane, especially to fundamentally elucidate 
other potential separation mechanisms (e.g.,
charge and adsorption effects) in addition to 
size exclusion. More research is also needed 
to address specific issues concerning vari- 
ous exciting yet challenging applications in 
desalination, hydrofracking water treatment, 
and energy production, as well as in biomed- 
ical and pharmaceutical fields. Other largely 
unexplored areas include making multifunctional
GO membranes with exceptional antifouling,
adsorptive, antimicrobial, and photocatalytic
properties.
ENGINEERING


Robots Acting Locally and Building Globally


Judith Korb


Termites are among the most fascinating
animal architects in nature; their
mounds were first described in a scientific
journal more than 200 years ago (1).
How can such tiny insects, each less than 1 
cm in size and equipped only with a simple 
brain, construct air-conditioned buildings up 
to 500 times their size? Termites’ construc- 
tion principles differ fundamentally from 
those of human architecture. Humans build 
houses according to a blueprint, and the con- 
struction process is centrally guided by this 
plan. In contrast, social insects such as ter- 
mites build in a decentralized, self-organized 
manner. Each individual works rather inde- 
pendently and follows a set of simple rules; 
the interactions among the workers and the 
interaction of each worker with its environ- 
ment ensure an organized process without a 
central blueprint (2-4). On page 754 of this 
issue, Werfel et al. (5) describe the use of 
such insect principles to guide simple robots 
in constructing user-defined structures for 
human purposes. 

Robots that act like termites can construct
complex structures, guided only by simple
rules and sensing their local environment.

Coordinated construction. (A) This termite mound, 3 m high, is the air-conditioned home of a Macrotermes
bellicosus colony. The mound is constructed by thousands of tiny workers that coordinate their building activity
through local information at the construction site. (B) The robots developed by Werfel et al. use similar principles
to construct complex structures.

Central to the work of Werfel et al. is the
principle of stigmergy (6): Social insects
use local information at the building site to
coordinate building activity. As this information
changes during the building process, the
behavior is adjusted accordingly. An example
in termites is the proposed deposition of
chemical volatiles with the building particles
that guide individuals to local construction
sites. Similarly, Werfel et al.’s autonomous
constructing robots move along a grid system
and deposit building bricks next to other
bricks. The robots are simple, even more so
than termite workers. The robots can only
sense bricks and the other robots next to
them. They can move backward or forward,
turn in place, and climb one step up or down;
they can pick up, carry, and deposit bricks.
The robots adjust their behavior according
to what they perceive locally when they
move along the grid system; the possibilities
include “nothing,” other robots, and bricks.
The exact “traffic rules” depend upon the
structure to be built, and these rules are
derived by an offline compiler that transforms
three-dimensional representations
of a desired structure into two-dimensional
projections and recursively determines the
rules for the robots to follow. This system is 
extremely elegant, as it allows the autono- 
mous construction of any predefined struc- 
tures with simple robots. It is robust to 
failure because of its decentralization, 
and is very flexible in its adjustments. The 
approach by Werfel et al. differs from others 
in that it develops the lower-level rules from 
the final structure to be built. Commonly, 
the reverse approach is pursued in complex 
systems science, but determining the emer- 
gent higher-level result from lower-level 
rules has proved difficult. Hence, it is also 
still unclear how termites can construct their 
impressive castles from the simple behav- 
iors that researchers observe (3, 4). 
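
The idea of building from purely local information can be illustrated with a toy sketch. This is not the compiler or rule set of Werfel et al.; it is a deliberately minimal stand-in in which a single simulated builder sweeps a two-dimensional grid and applies one assumed local rule: place a brick at a site only if the site belongs to the desired structure, is still empty, and touches an existing brick. The function names, the grid sweep, and the L-shaped target are all hypothetical.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]


def neighbors(cell: Cell) -> Tuple[Cell, ...]:
    """Return the four grid cells that share an edge with `cell`."""
    x, y = cell
    return ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))


def build(target: Set[Cell], seed: Cell, width: int, height: int) -> Set[Cell]:
    """Grow a structure from a seed brick using only a local placement rule."""
    if seed not in target:
        raise ValueError("seed brick must be part of the target structure")
    built = {seed}
    placed = True
    while placed:  # keep sweeping the grid until a full sweep places nothing
        placed = False
        for y in range(height):
            for x in range(width):
                site = (x, y)
                # Local information only: the builder looks at adjacent cells.
                touches_brick = any(n in built for n in neighbors(site))
                if site in target and site not in built and touches_brick:
                    built.add(site)
                    placed = True
    return built


# Hypothetical target: an L-shaped wall on a 3 x 3 grid.
L_SHAPE = {(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)}
result = build(L_SHAPE, seed=(0, 0), width=3, height=3)
print(result == L_SHAPE)  # prints True
```

Because bricks are only ever added next to existing bricks, the structure grows outward from the seed, and any part of a target that is not connected to the seed is never built. In this sketch the check `site in target` stands in for the precompiled "traffic rules"; the real system encodes the target in the rules themselves rather than letting robots consult a blueprint.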


In both nature’s construction works and 
the structures created by the robots in the 
approach of Werfel et al., the properties
of the final product are crucial. A termite 
mound’s architecture can determine the suc- 
cess of a colony (7). Mounds that are bet- 
ter adapted to local environments will, as a 
rule, have more offspring; thus, improved 
building rules that are genetically encoded 
will spread over time through a population. 
What is different in nature is that it starts 
with “mutations” in the building rules that 
are then tested in the evolutionary process. 
Over the millennia, evolution tested differ- 
ent rules, and what we observe today are 
those that worked. They might not be perfect,
and the algorithms of Werfel et al.
might also show us whether termites could
still “learn” from humans.

PERSPECTIVES
BIOCHEMISTRY


Protein Folding, Interrupted 


Kim A. Sharp 


An antifreeze protein with a core consisting
mostly of ordered water molecules violates
widely held ideas about protein stabilization.

Globular proteins start their lives as
linear chains of amino acids coming
off the ribosome. Proteins must then
fold into specific three-dimensional structures
to be functional. In 1957, the first such
structure, of myoglobin, was determined at
atomic resolution (1). Fifty-six years and
90,000-plus protein structures later (2),
we have a very good idea of the necessary
requirements for a stable, specific structure.
Key to these requirements is the formation
of a well-packed, largely anhydrous core
(3). Yet, on page 795 of this issue, Sun et al.
(4) report an antifreeze protein with a core
mostly consisting of water.

In globular proteins, the anhydrous protein
core provides both structural specificity
and energetic stabilization (see the figure,
panel A). Burial of apolar amino acid side
chains inside the core relieves their unfavorable
interaction with water, a process known
as the hydrophobic effect (5, 6). Even integral
membrane proteins, which function in
the nonaqueous lipid bilayer of the membrane
and adopt structural motifs that are
quite different from those of globular proteins,
conform to this general principle.
Here, the apolar side chains are on the outside
of the structure, but by forming close
contacts with the apolar lipid tails, they are
still removed from water (7).

Core questions. In a typical globular protein, the four-helix bundle Rop
[Protein Data Bank code: 4D02 (10)] (A), water (blue) is excluded from the
core because of efficient, interdigitated packing of apolar side chains
(magenta), surrounded by polar side chains (green). By contrast, highly
ordered water remains in the core in the antifreeze protein Maxi reported by
Sun et al. (B). Secondary structure is rendered in orange. Slices through the
van der Waals atomic surface taken through the core of the two proteins are
shown in wire frame. Van der Waals surface slices were produced by Pdb2DSlice
and rendered in Pymol (22). In (B), the central third of the 145 Å-long Maxi
protein is shown.

The remarkable structure of the antifreeze
protein Maxi reported by Sun et al.
flaunts its violation of the anhydrous-core
principle. Maxi is a 145 Å-long four-helix
bundle formed as a dimer of two-helix
monomers. More than 400 highly organized
water molecules form an integral part of
Maxi’s structure. The water is interleaved as
a roughly two-molecule-thick layer between
both intra- and intermonomer helix interfaces,
effectively forming the core of the protein
and extending to the ice-binding surface
(see the figure, panel B).

Sun et al. show that the structure deter- 
mined from crystallography persists in 
solution as the active form of the protein. 
Although ordered water molecules can 
be detected in most high-resolution x-ray 
crystallographic structures, they are usu- 
ally located between replica molecules in 
the crystal lattice and are probably absent 
or quite rare when the protein is in native 
solution conditions, in contrast to the water 
inside Maxi. That the ordered water struc- 
ture in Maxi extends to the ice-binding sur- 
face is suggestive of the function of this 
unusual core, given that Maxi, as an anti- 
freeze protein, must bind ice nuclei and 
inhibit their growth to function. As Sun et al. 
suggest, this function may have driven the 
evolution of its unique water core, although 
clearly such a core is not necessary for anti- 
freeze activity. No other known antifreeze 
protein has a water core like Maxi’s. 

When a protein folds, it forms van der 
Waals (packing) interactions, hydrogen 
bonds, and electrostatic interactions between 
charged and polar side chains within the 
protein. Each of these interactions com- 
petes with interactions of comparable 
strength and number between water and 
protein in the unfolded state. No such competition
exists for the hydrophobic effect:
Entropically favorable release of water upon
burying apolar groups unambiguously favors
the folded state. Thus, it is widely accepted
that stabilization of globular proteins occurs
primarily through the hydrophobic effect (8).
It is therefore startling that Maxi retains the
very structure of water (“semi-clathrate” in
the words of Sun et al.) whose formation
around apolar groups and subsequent disappearance
was deemed to be the hallmark
of the hydrophobic effect (5) and pivotal for
protein stabilization. Clearly, the balance of
interactions that stabilize Maxi is quite different
from that used by most proteins.

Maxi’s structure has intriguing implica- 
tions not only for the energetics of protein 
folding, but for the mechanism and kinet- 
ics as well. Examination of the anhydrous 
core of any protein with its convoluted but 
well-packed atom-atom interfaces (see the 
figure, panel A) raises the question of how 
the water is removed so efficiently dur- 
ing folding. Removal of this water has been 
proposed as a potential rate-limiting step 
in protein folding. Two competing mecha- 
nisms have been proposed: Either the water 
is driven out as the protein collapses, or the 
unfavorable hydration free energy of apo- 
lar groups leads to their spontaneous dewet- 
ting (9). In the first mechanism, protein pack- 
ing drives dehydration in the manner of a 
squeegee, whereas in the second, packing fol- 
lows dehydration. 


Resolution of this question depends 
on the relative magnitudes of packing and 
hydration forces, which have proved dif- 
ficult to determine by experiment or the- 
ory. Moreover, proteins may well use both 
mechanisms. But it seems Maxi did not get 
the memo on how to fold: It chooses nei- 
ther route to dehydration. Maxi folds to the 
point where water not in direct contact with 
the protein chain is removed from its core. It 
then arrests further folding to retain a beau- 
tifully ordered core of water interleaved 
between the protein helices. Further analy- 
sis of the energetics and kinetics of folding 
of Maxi will deepen our understanding of 
protein folding and stabilization. 
NEUROSCIENCE


Genetic Resolutions of Brain Convolutions

Brian G. Rash¹ and Pasko Rakic¹,²

¹Department of Neurobiology, Yale University, New Haven,
CT 06510, USA. ²Kavli Institute for Neuroscience, Yale University,
New Haven, CT 06510, USA. E-mail: pasko.rakic@yale.edu


Cortical convolutions (prominent
folds on the surface of the human
brain) have a long history of speculation (1).
The claims range from their function
as a bodily cooling system to the attribution
of Einstein’s genius to the unusual
shape of a single gyrus (the ridge of a cortical
fold). Only recently, with advances in
molecular genetics and brain imaging techniques,
has it become possible to study the
development, evolution, and abnormalities
of cerebral convolutions in a scientifically
rigorous manner (2). On page 764 of this
issue, Bae et al. (3) show that a specific gene
controls the number of gyri that form in a
region of the cerebral cortex that includes
Broca’s area (the major language area). This
begins to pinpoint mechanisms that underlie
the development of specialized regions
of the human brain and may be relevant to
understanding human brain evolution.

Genetic analysis of human brain abnormalities
aids our understanding of how the cerebral
cortex develops and evolves.

Bae et al. examined individuals, from
three consanguineous families, with abnormal
cortical folding restricted to a region
surrounding the Sylvian fissure, including
Broca’s area within the frontal lobe.
Through a genome-wide linkage analysis,
the authors traced the abnormality to mutations
in the noncoding regulatory region of
the GPR56 gene. GPR56 encodes a protein
that functions in cell adhesion and
guidance. Mutations caused the peri-Sylvian
cortex to be thinner and smoother and
exhibit multiple shallow indentations (polymicrogyria).
Moreover, the authors discovered
a natural, spontaneous mutation in the
GPR56 locus, which points to a mechanism
that underlies both the formation of cortical
maps and the process of gyrification. Bae et
al. generated transgenic mice in which different
expression patterns of a reporter gene
(encoding β-galactosidase) could be driven
by part of the noncoding region of GPR56
(a minimal promoter) taken from human,
marmoset, dolphin, cat, and mouse. This
indicates evolutionary changes in cortical
GPR56 expression. Overexpression
of GPR56 in mice increased
proliferation of neuroprogenitor
cells in the cortex, whereas loss
of GPR56 expression had the
opposite effect.

Localized gyral abnormalities. (A)
Early embryonic stage in an individual
with the GPR56 mutation showing the
prospective areas surrounding the Sylvian
fissure (red) and adjacent cortex
(blue) within the indicated zones. (B)
Middle stage of corticogenesis indicating
the prospective normal and affected
peri-Sylvian cortical areas. (C) Postmigratory
stage, showing abnormal gyrification
and cytoarchitecture in the peri-Sylvian
region flanked by normal cortical
areas.

The results of Bae et al. are
predictable by the radial unit 
hypothesis and related protomap 
hypothesis, which concern the 
formation of the cortical areas 
and convolutions. These concepts 
have framed our current under- 
standing of the development and 
evolution of cortical areas and 
their convolutions in the context 
of the genetic regulation of pro- 
liferation and migration of newborn neurons 
into columns within the overlying cortical 
plate (4). The models explain why the cor- 
tex normally expands not as a lump or glob- 
ular nucleus (such as the striatum or thala- 
mus), but as a relatively uniform thin, flat 
sheet composed of an array of radial units. 
Because the ventricular zone (VZ) of the 
cerebral cortex produces deep layers, and 
the subventricular zone (SVZ) produces 
neurons destined mostly for the superficial 
layers, both zones must be equally engaged 
in this complex process (5) (see the figure). 
The horizontal constraints provided by the 
radial glial cell scaffolding, which is so 
prominent in the gyrencephalic brains of 
human and nonhuman primates, explain
why the larger cortex became convoluted 
and forms primary, and initiates secondary, 
furrows (sulci) on the brain’s surface even 
before birth (2, 4). 

Animal studies have shown that the size 
and position of various cortical areas can 
be manipulated in a region-specific man- 
ner by molecules that control morphogene-
sis (6-10), indicating that the pattern of the
primordial cortical protomap is malleable at 
early embryonic stages. Mechanistically, for 
example, a local increase in the prolifera- 
tive rate of a prepatterned regional popula- 
tion of radial glial cells (cortical neural stem 
cells) at early stages will add radial units and
enlarge the surface area for that population 
compared to the rest of the cortical primor- 
dium. By contrast, a local decrease in pro- 
liferation at early stages would diminish the 
cortical surface, and if continued at later 
stages, would also decrease cortical thick- 
ness (see the figure). The results of Bae et 
al., in which both VZ and SVZ are involved, 
are predictable by these models and support 
the idea that cortical size and its foldings 
are an extended property of local cell pro- 
liferation in the transient embryonic zones. 
The grossly distorted and disorganized cor-
tical columns observed by Bae et al. could
be attributed to the abnormal shape of radial 
glial cells, some of which lose their attach- 
ments to the pial surface (the thin membrane 
that surrounds the brain). 

After the protomap is established, the 
final pattern of gyri in each area is probably 
a product of both differential cellular prolif- 
eration and subsequent formation of connec-
tions during cortical maturation (11). Bae et
al. show that gyrification in human cortex 
is a regional event controlled by local pro- 
liferation that starts at early stages, before 
the initiation of neuronal connections (2, 4). 
This supports the existence of mechanisms 
that locally regulate gyrification, but the 
extent to which they are intermingled with 
the mechanisms of protomap
patterning or the subsequent formation
of abnormal connections
requires further investigation.
The study by Bae et al. raises
questions about how new convolutions
and even new cortical
lobes can be created during evolution
by simple genomic mutations (2).
That areas and gyri can
be modified or created at will in
the laboratory by altering a single
gene or factor that controls morphogenesis
provides insight into
the possible mechanisms of cortical
development (6-10, 12, 13),
but whether such experimentally
induced changes are biologically
useful is another question.
Bae et al. have shown that a single
gene mutation can reconfigure
the cortex in a functionally
deleterious manner, but through
evolution, such a mutation may
prove to be advantageous for the
survival of species. The GPR56
promoter can drive gene expression
in the lateral cortex in nonhuman
species, indicating that it
is important, but not sufficient,
to create a new area such as Broca’s, so the
search for the decisive gene(s) should continue.
The study by Bae et al. demonstrates
the potential of using advanced methods in
humans to identify specific genes and test
their function in animals in order to obtain
information on the origin of human uniqueness
(2, 14, 15).

[Figure labels: Outer SVZ, SVZ, VZ, Cortex, White matter]
Toddler: An Embryonic Signal
That Promotes Cell Movement
via Apelin Receptors
Shengdar Q. Tsai, J. Keith Joung, Alan Saghatelian, Alexander F. Schier*


Introduction: Embryogenesis is thought to be directed by a small number of signaling pathways 
with most if not all embryonic signals having been identified. However, the molecular control of 
some embryonic processes is still poorly understood. For example, it is unclear how cell migration is 
regulated during gastrulation, when mesodermal and endodermal germ layers form. The goal of our 
study was to identify and characterize previously unrecognized signals that regulate embryogenesis. 


Methods: To identify uncharacterized signaling molecules, we mined zebrafish genomic data sets 
for previously non-annotated translated open reading frames (ORFs). One such ORF encoded a 
putative signaling protein that we call Toddler (also known as Apela/Elabela/Ende). We analyzed 
expression, production, and secretion of Toddler using RNA in situ hybridization, mass spectrometry, 
and Toddler-GFP fusion proteins, respectively. We used transcription activator-like effector (TALE)
nucleases to generate frame-shift mutations in the toddler gene. To complement loss-of-function 
analyses with gain-of-function studies, Toddler was misexpressed through mRNA or peptide 
injection. We characterized phenotypes using marker gene expression analysis and in vivo imaging, 
using confocal and lightsheet microscopy. Toddler mutants were rescued through global or localized 
toddler production. The relationship between Toddler and APJ/Apelin receptors was studied through
genetic interaction and receptor internalization experiments. 


Results: We identified several hundred non-annotated candidate proteins, including more than 20 
putative signaling proteins. We focused on the functional importance of the short, conserved, and 
secreted peptide Toddler. Loss or overproduction of Toddler reduced cell movements during zebra- 
fish gastrulation; mesodermal and endodermal cells were slow to internalize and migrate. Both the 
local and ubiquitous expression of Toddler were able to rescue gastrulation movements in toddler 
mutants, suggesting that Toddler acts as a motogen, a signal that promotes cell migration. Toddler 
activates G-protein-coupled APJ/Apelin receptor signaling, as evidenced by Toddler-induced inter-
nalization of APJ/Apelin receptors and rescue of toddler mutants through expression of the known
receptor agonist Apelin. 


Discussion: These findings indicate that Toddler promotes cell movement during zebrafish gastrula- 
tion by activation of APJ/Apelin receptor signaling. Toddler does not seem to act as a chemo-attractant 
or -repellent, but rather as a global signal that promotes the move- 
ment of mesendodermal cells. Both loss and overproduction of Tod- 
dler reduce cell movement, revealing that Toddler levels need to be 
tightly regulated during gastrulation. The discovery of Toddler helps 
explain previous genetic studies that found a broader requirement 
for APJ/Apelin receptors than for Apelin. We propose that in these
cases, Toddler, not Apelin, activates APJ/Apelin receptor signal-
ing. Our genomics analysis identifying a large number of candidate
proteins that function during embryogenesis suggests the existence 
of other previously uncharacterized embryonic signals. Applying 
similar genomic approaches to adult tissues might identify addi- 
tional signals that regulate physiological and behavioral processes. 
Fig. 1. Identification of the novel embryonic
signal Toddler. 


Fig. 2. Toddler is essential for embryogenesis. 


Fig. 3. Abnormal gastrulation movements 
in toddler mutants. 


Fig. 4. Toddler functions as a motogen. 
Fig. 5. Toddler acts via Apelin receptors. 


Fig. 6. Toddler drives internalization of 
Apelin receptors. 
Toddler promotes gastrulation movements via
Apelin receptor signaling. Toddler is an essential,
short, conserved embryonic signal that promotes cell
migration during zebrafish gastrulation. The inter-
nalization movement highlighted by the colored cell
tracks requires Toddler signaling. Toddler signals via
the G-protein-coupled APJ/Apelin receptor and may
be one of several uncharacterized embryonic signals.
It has been assumed that most, if not all, signals regulating early development have been
identified. Contrary to this expectation, we identified 28 candidate signaling proteins expressed 
during zebrafish embryogenesis, including Toddler, a short, conserved, and secreted peptide. 
Both absence and overproduction of Toddler reduce the movement of mesendodermal cells 
during zebrafish gastrulation. Local and ubiquitous production of Toddler promote cell movement, 
suggesting that Toddler is neither an attractant nor a repellent but acts globally as a motogen. 
Toddler drives internalization of G protein-coupled APJ/Apelin receptors, and activation of
APJ/Apelin signaling rescues toddler mutants. These results indicate that Toddler is an activator
of APJ/Apelin receptor signaling, promotes gastrulation movements, and might be the first in a
series of uncharacterized developmental signals.


Many of the inductive events during
development are directed by a
small number of signaling pathways
whose agonists have been known for more than
a decade (1). Therefore, it has been assumed that
most, if not all, embryonic signals have been 
identified. However, the molecular control of 
some embryonic processes is still poorly under- 
stood. For example, it is largely unclear how 
cell migration is regulated during gastrulation 
or how cells coalesce into discrete tissues during 
organogenesis (2-5), suggesting that some of 
the involved signals are yet to be identified. More- 
over, recent genomic studies have suggested that 
translation of short open reading frames (ORFs) 
and the generation of small peptides are much 
more pervasive than previously assumed (6, 7). 
To search for new candidate signaling molecules, 
we used the Translated ORF Classifier (TOC) (7) 
to examine zebrafish transcript annotations and 
ribosome profiling data sets (7-9) for non- 
annotated translated ORFs (Fig. 1A) (materials 
and methods in the supplementary materials). This 
analysis identified 700 novel protein-coding tran-
scripts (399 loci) (supplementary data files S1
and S2), of which 81% (562 transcripts in 325
loci) shared nucleotide sequence alignments with 
other vertebrates (table S1). Notably, this ap- 
proach identified 28 candidate signaling proteins 
(40 transcript isoforms) characterized by the 
presence of putative signal sequences and lack 
of predicted transmembrane domains (table S1). 
Ribosome profiling and phylogenetic analysis 
suggest that these RNAs can generate secreted 
peptides with lengths ranging from 32 to 556 
amino acids (Fig. 1A, fig. S1, and table S1). Al- 
though these genes have not been identified pre- 
viously or are annotated in the zebrafish Ensembl 
database as noncoding RNAs, the majority (24 
of 28) appear to be conserved in other verte- 
brates (fig. S1 and table S1). 
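
The selection criterion described above (a putative signal sequence and no predicted transmembrane domain) and the reported counts can be sketched in a few lines. This is only an illustration: the record type, field names, and the three example ORFs are hypothetical and are not drawn from the TOC pipeline or the supplementary tables. Only the counts used in the percentage checks (700 transcripts, 399 loci, 562 transcripts in 325 loci, 24 of 28 candidates) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrfCandidate:
    """Hypothetical summary of one non-annotated translated ORF."""
    name: str
    has_signal_sequence: bool
    n_transmembrane_domains: int


def is_candidate_signal(orf: OrfCandidate) -> bool:
    """Criterion from the text: signal sequence present, no TM domains."""
    return orf.has_signal_sequence and orf.n_transmembrane_domains == 0


# Made-up examples to exercise the filter.
orfs = [
    OrfCandidate("orf_a", has_signal_sequence=True, n_transmembrane_domains=0),
    OrfCandidate("orf_b", has_signal_sequence=True, n_transmembrane_domains=2),
    OrfCandidate("orf_c", has_signal_sequence=False, n_transmembrane_domains=0),
]
secreted = [orf.name for orf in orfs if is_candidate_signal(orf)]
print(secreted)

# Percentages implied by the counts reported in the text.
pct_conserved_loci = round(100 * 325 / 399)
pct_conserved_transcripts = round(100 * 562 / 700)
pct_conserved_signals = round(100 * 24 / 28)
print(pct_conserved_loci, pct_conserved_transcripts, pct_conserved_signals)
```

On these counts, the stated 81% corresponds to the fraction of loci (325 of 399); the fraction of transcripts (562 of 700) is about 80%. The 24 of 28 conserved candidate signaling proteins amount to roughly 86%.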


Toddler Encodes a Short, Conserved, 
and Secreted Peptide 


To test the functional potential of these candidate 
signals, we focused on a gene that we named toddler 
on the basis of the phenotype described below (Fig. 
1B). Toddler (tdl) mRNA is expressed ubiquitously 
during late blastula and gastrula stages and becomes 
restricted to the lateral mesoderm, endoderm, and 
anterior and posterior notochord after gastrula-
tion (Fig. 1C). Toddler is annotated as a non-
coding RNA in zebrafish (ENSDARG00000094729),
mouse [Gm10664; also called Ende (10)], and
human (LOC100506013) (fig. S2) and is present
in two lncRNA catalogs (9, 11); however, it con-
tains a 58-amino acid ORF with a predicted signal
sequence and high conservation in vertebrates, in-
cluding human (Fig. 1D and fig. S3). Sequence
comparisons with the highly conserved C-terminal 
portion did not identify homology to any other 
known proteins, raising the possibility that this 
gene encodes an uncharacterized embryonic signal. 
Six lines of evidence indicate that toddler is
translated and encodes a secreted peptide. First, 
phylogenetic comparisons of synonymous versus 
nonsynonymous codon changes reveal strong amino 
acid preservation in the toddler ORF (PhyloCSF 
score of 98 (8); see Fig. 1, B and D, and table S1). 
Second, previous ribosome profiling data in mouse 
(6) and zebrafish (7) indicate that the toddler ORF 
is protected by actively translating ribosomes 
in vivo (Fig. 1B). Third, mass spectrometric anal- 
ysis of nontrypsinated protein extracts from em- 
bryos expressing toddler mRNA detected the 
11-amino acid C-terminal Toddler peptide fragment
that is predicted to be a convertase cleavage
product (Fig. 1D and fig. S4). Fourth, enhanced 
green fluorescent protein (eGFP) fusion proteins 
containing the wild-type signal sequence of Tod- 
dler are found extracellularly, whereas signal pep- 
tide cleavage site mutants are retained in the cell 
(Fig. 1E). Fifth, as described below, extracellular 
injection of in vitro-synthesized Toddler peptide
(C-terminal 21 amino acids) elicits the same
gain-of-function phenotypes as excess of tod- 
dler mRNA. Sixth, wild-type but not frameshifted 
toddler mRNA rescues toddler mutants (see be- 
low), providing direct evidence that it is the pep- 
tide product rather than the RNA that is functional 
in vivo. Together, these findings identify Toddler 
as a short, conserved, and secreted peptide. 


Toddler Is Essential for Embryogenesis 


To disrupt toddler function, we generated mu- 
tants by TALEN-mediated mutagenesis (fig. S5 
and materials and methods) (12, 13). Seven toddler
alleles were recovered, each of which introduces
a frameshift immediately after the signal
peptide sequence (fig. S5, B and C). The vast 
majority of homozygous toddler mutants die be- 
tween 5 and 7 days of development and display 
small or absent hearts, posterior accumulation 
of blood cells, malformed pharyngeal endoderm, 
and abnormal left-right positioning and formation 
of the liver (Fig. 2, A and B, and fig. S6). Pene- 
trance and expressivity of toddler mutants vary, 
including occasional escapers that live to adult- 
hood and rare instances of toddler mutants that 
display more severe defects in endoderm and 
mesoderm formation (fig. S7). Notably, the le- 
thality of toddler mutants (survival, 0 of 25 ani- 
mals) was rescued by injection of low amounts 
(2 pg) of wild-type (survival, 23 of 30 animals) 
but not frameshifted (survival, 0 of 32 animals) 
toddler mRNA (Fig. 2, A, C, and D). Embryon- 
ically rescued toddler mutants survived to adulthood 
and were fertile in the absence of any later source 
of Toddler peptide, indicating that zebrafish Tod- 
dler is only essential during early embryogenesis. 
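
As a rough illustration of how strongly the rescue counts above separate the two injected groups, the survival fractions can be tabulated and compared with a one-sided Fisher's exact test. This is a minimal sketch: the choice of test is an assumption made for illustration and is not an analysis reported in the text, and the helper name `fisher_one_sided` is hypothetical. Only the counts (0 of 25, 23 of 30, 0 of 32) come from the text.

```python
from math import comb

# Survival counts from the text: (survivors, total animals).
groups = {
    "uninjected mutant": (0, 25),
    "wild-type mRNA": (23, 30),
    "frameshifted mRNA": (0, 32),
}

def fisher_one_sided(a, n1, c, n2):
    """One-sided Fisher's exact test (hypothetical helper, illustration only).

    a of n1 animals survive in group 1 and c of n2 in group 2. Returns the
    probability of seeing at least `a` survivors in group 1 if survival were
    independent of treatment (hypergeometric upper tail).
    """
    survivors, total = a + c, n1 + n2
    upper = min(n1, survivors)
    tail = sum(comb(n1, k) * comb(n2, survivors - k) for k in range(a, upper + 1))
    return tail / comb(total, survivors)

# Fraction of animals surviving in each group.
survival = {name: s / n for name, (s, n) in groups.items()}

# Wild-type versus frameshifted toddler mRNA injection.
p_rescue = fisher_one_sided(23, 30, 0, 32)
```

Under these assumptions the one-sided probability for the wild-type versus frameshifted comparison is far below conventional significance thresholds, in line with the qualitative conclusion drawn in the text.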


Toddler Is Required for Normal
Gastrulation Movements

To determine when Toddler function is required 
during early embryogenesis, we used a heat shock-inducible
transgene. Induction of toddler expression
during late blastula and early gastrula stages,
but not at later times, rescued toddler mutants
(fig. S8 and materials and methods). 

The early requirement for Toddler, together 
with its expression peak during gastrulation (Fig. 
1C), suggested that the later phenotypes originate 
from earlier defects. We therefore analyzed mor- 
phology and gene expression during blastula and 
gastrula stages and discovered that toddler mutant 
mesendodermal progenitors did not move prop- 
erly toward the animal pole during gastrulation. 
Although ventral and lateral mesendodermal cells 
in wild-type embryos internalized at the margin 
and moved toward the animal pole (Fig. 2, C and 
E), these cells were closely packed and confined
to a band near the margin in toddler mutant embryos
(Fig. 2, C and D, and fig. S9). These defects
were apparent by analysis of endodermal (sox17)
and mesodermal (fibronectin1/fn1, spadetail/tbx16,
fascin, draculin/drl) markers (Fig. 2C and fig. S9).
In contrast, ectodermal (sox3), dorsal mesodermal 
(goosecoid/gsc, hgg1), and tail mesodermal (ntla) 
markers were largely unchanged in their expression 
domains (fig. S10). In addition to the ventrolateral 
movement defects, toddler mutants contained ~20% 
fewer endodermal cells at mid-gastrulation (Fig. 
2, C and D, and fig. S9A). The initial expression of 
mesendodermal markers appeared unaffected 
(fig. S10B), suggesting that mesendodermal cells
are specified normally in toddler mutant embryos
but proliferate less. Notably, the toddler gastrula- 
tion phenotypes could be rescued by injecting low 
levels (2 pg) of toddler mRNA at the one-cell stage 
(Fig. 2, C and D, and fig. S9, A and C). These 
results reveal an important role for Toddler in the 
movement of ventral and lateral mesendodermal 
cells during gastrulation. 


Toddler Promotes Endodermal
and Mesodermal Cell Migration

To determine how Toddler affects the movement of 
cells during gastrulation, we performed live cell im- 
aging and followed cell trajectories in wild-type and 
Fig. 1. Identification of the novel embryonic signal Toddler. (A) Overview
of the individual steps used to identify novel coding and noncoding
transcripts. SP, signal peptide; RPFs, ribosome protected fragments. (B) Ge- 
nomic features of toddler. Coverage tracks for RNA-Seq (black) and ribosome 
profiling (blue), and tracks outlining the highest scoring regions in PhyloCSF 
(orange). Note that both PhyloCSF (8) and ribosome profiling (7) predict toddler 
to be protein-coding. (C) Expression analysis of toddler transcripts during em- 
bryogenesis. toddler transcripts peak during gastrulation [RNA-Seq data (8)]. 
FPKM, fragments per kilobase of transcript per million mapped reads. RNA in 
situ hybridization reveals ubiquitous expression of toddler transcripts at the be- 
ginning of gastrulation [6 hours postfertilization (hpf)]; expression becomes 
restricted to mesendodermal cells toward the end of gastrulation (9 hpf). 
nt, notochord; lpm, lateral plate mesoderm; endo, endoderm. (D) Toddler is conserved
in vertebrates. ClustalW2 multiple protein sequence alignment of Toddler
peptide sequences from five vertebrates. Darker shading indicates higher percentage
identity of the amino acid. The predicted signal peptide cleavage site
and the highly conserved C-terminal 11-amino acid (aa) peptide fragment
that was detected by mass spectrometry are indicated. (E) Toddler signal 
sequence drives secretion. Injection of mRNAs encoding C-terminal Toddler- 
eGFP fusion proteins reveals that the wild-type Toddler signal sequence drives 
secretion (extracellular localization of eGFP), whereas mutation of A>W in the 
signal peptide cleavage site causes Toddler-eGFP to remain intracellular
(top, wild-type Toddler ORF; bottom, A>W mutant Toddler ORF). Fusion protein
diagrams are not drawn to scale. Scale bars, 20 µm.
Fig. 2. Toddler is essential for embryogenesis. (A) Morphological analysis
of toddler mutants. TALEN-induced toddler null mutants (see fig. S5) lack a
functional heart, have no blood circulation, and accumulate blood posteriorly 
(black arrowheads). Defects in toddler mutant embryos are rescued by low doses 
(2 pg) of toddler mRNA. Injection of higher doses of toddler mRNA (>10 pg) 
causes phenotypes in wild-type embryos reminiscent of toddler loss-of-function 
mutants. Shown are lateral views of embryos of the indicated genotypes at 
30 hpf. (B) Marker gene analysis in wild-type and toddler mutant embryos at 
36 hpf (cmlc2), at the 8 to 10 somite stage (scl/tal), at 30 hpf (foxa2), and 
at 3 days postfertilization [ceruloplasmin (cp)]. Black arrows indicate lack 
of or reduced staining in toddler mutant embryos; black arrowheads indi- 
cate ectopic expression; white arrowheads point to the liver in wild-type (>70% 
on left side) and toddler mutant embryos (expression: 45% right, 15% medial, 
40% none/nonspecific). (C) Toddler is required for movement of ventrolateral 
endoderm and mesoderm toward the animal pole. Both absence of Toddler 
(toddler) and overexpression of toddler mRNA (wild-type embryos + 10 pg of 
toddler mRNA) reduce the movement of endodermal (sox17) and mesodermal 
[fibronectin 1 (fn1)] cells toward the animal pole, as detected by in situ
hybridization. All in situ images are lateral views of embryos at 70% epiboly
(dorsal to the right). Illustrations of the observed endodermal (blue) and 
mesodermal (red) phenotypes in wild-type (wt) and toddler mutant (tdl) em- 
bryos are shown on the right. (D) Quantification of the endodermal defects 
at 70% epiboly. Left, relative spread of lateral endoderm along the animal- 
vegetal axis (that is, height of lateral band of sox17-expressing cells divided 
by the wild-type mean); right, number of endodermal cells within a lateral, fixed- 
size area. Gray, wild-type genomic background; cyan, toddler mutant genomic 
background. P values for pairwise comparisons with wild-type (black, top) or 
toddler mutant (cyan, bottom) were calculated on the basis of a standard Welch’s 
t test (*P < 0.01; **P < 0.00001). (E) Illustration of early gastrulation move- 
ments in wild-type zebrafish embryos. Mesodermal (red) and endodermal 
(blue) cells are induced and internalized at the margin (40% epiboly stage). 
Whereas internalized cells migrate toward the animal pole in either a directional
(mesoderm) or random walk-like pattern (endoderm) (3, 15), epiboly
movements are directed toward the vegetal pole (gray arrows). At 70% 
epiboly, mesodermal and endodermal cells have moved animally and cover 
most of the lateral side of the embryo. 
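
The quantification described for panel (D) can be sketched in a few lines: each measured band height is divided by the wild-type mean to give a relative spread, and two groups are then compared with Welch's unequal-variance t statistic. This is a minimal sketch under stated assumptions: the function names and the example heights are hypothetical and are not data from the study, and only the t statistic and the Welch-Satterthwaite degrees of freedom are computed here. Obtaining a P value additionally requires the t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
from math import sqrt
from statistics import mean, variance

def relative_spread(heights, wild_type_heights):
    """Divide each band height by the mean height of the wild-type group."""
    wt_mean = mean(wild_type_heights)
    return [h / wt_mean for h in heights]

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical band heights in arbitrary units (not data from the study).
wild_type = [100, 110, 90, 100]
mutant = [50, 60, 40, 50]

wt_rel = relative_spread(wild_type, wild_type)
mut_rel = relative_spread(mutant, wild_type)
t_stat, dof = welch_t(wt_rel, mut_rel)
```

Normalizing to the wild-type mean places embryos from different experiments on a common scale, and Welch's variant avoids assuming that the two groups share the same variance.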
toddler mutant embryos (movies S1 to S6).
Toddler mutant endodermal cells [sox17::GFP
(14)] displayed reduced movement toward the
Fig. 3. Abnormal gastrulation movements in toddler mutants. (A and B) Analysis
of endodermal cell migration in sox17::eGFP transgenic wild-type and
toddler mutant embryos by confocal microscopy. Green, endodermal cells
(marked by sox17::eGFP); red, nuclei [human histone2B-RFP (H2B-RFP) mRNA
injection]. (A) Still images of maximum intensity projections of a
time-lapse movie from 60 to 90% epiboly (movies S1 and S2). (B) Quantification
of the average (median) velocity of endodermal cells (left), displacement versus
to wild-type cells (Fig. 3B and fig. S11). During
early gastrulation, toddler mutant endodermal
cells exhibited the characteristic random walk-
display the random movement of endodermal cells
during early gastrulation and the more directional migration at later stages
[animal (A), posterior (P), dorsal (D), ventral (V)]. (C to I) Analysis of early
gastrulation movements in H2B-RFP mRNA-injected wild-type and toddler mutant
embryos by light-sheet microscopy (single-plane illumination microscopy).
(C to H) Internalization and animal pole-directed movement of lateral mesendodermal
cells are reduced in toddler mutants. Analyses are shown for lateral cross
sections of a time-lapse movie (movie S4) of a wild-type–toddler mutant embryo
pair, imaged in parallel at 90-s intervals within a single experiment. (C) Still
images of maximum intensity projections of 40-µm lateral cross sections
(20 z-slices) during the time of internalization (time in minutes:seconds). Movies
were aligned at 50% epiboly (48:00). Leading edges of internalizing mesendodermal
cells (yellow dots) and vegetally moving cells (green dots) highlight the
opposing paths of cells during gastrulation. Red stars mark the onset of cell
internalization. (D) Comparison of animally and vegetally directed migratory
paths of the wild-type and mutant embryo pair shown in (C). Frame-to-frame
displacements (plotted on the left) were used to derive the net animal pole-directed
cell movement. Toddler mutants (cyan) show delayed onset of internalization
and reduced step-to-step and net animal pole-directed movement. (E to
G) Cell tracking and digital analysis of gastrulation movements. (E) Position, speed
(dot size), and directionality [color-coded from blue (vegetal movement) to red
(animal movement)] of tracked cells during and after the time of internalization
[t(Int)]. Movies were aligned to the onset of internalization [t(Int) = 00:00; time
in hours:minutes]. (F and G) Cell tracks before (t < −5 min), during (−5 min < t <
1 hour), and after (t > 1 hour) internalization in wild-type and toddler mutant
embryos. In (F), tracks were color-coded on the basis of the total number of animal
pole-directed (red) or vegetal pole-directed (blue) movements, normalized to the
total number of frames per track. In (G), tracks were color-coded on the basis of
their relative position and directionality with respect to the margin at the time of
internalization (margin cells: cells located within 100 µm above the margin at the
onset of internalization). Gray, nonmargin cells; black, margin cells; red, internalizing
and upward-moving margin cells. (H) Quantification of the mean velocity of
internalizing, animal pole-directed movement in wild-type and toddler mutant
embryos. Mean track velocities were obtained from cell-tracking data derived from
lateral cross sections of six wild-type (gray) and six toddler mutant (cyan) embryos,
imaged in parallel in three independent experiments. Pooled wild-type and toddler
mutant mean track velocities are plotted on the right (n = number of cell tracks).
(I) Toddler mutant embryos are defective in ventrolateral but not dorsal internalization.
(Left) Still image of maximum intensity projections of 40-µm dorsal-ventral
cross sections (20 z-slices) of a wild-type–toddler mutant embryo pair
110 min after the onset of internalization. Arrows highlight the paths that the
leading internalizing cells took on dorsal (D, dashed white line) and ventral (V,
solid yellow line) sides of the embryo. Ventral movement toward the animal pole
is severely reduced in the toddler mutant embryo, whereas dorsal internalization
occurs normally. (Right) Quantification of the fraction and speed of internalizing
marginal cells based on their positioning in the embryo (dorsal versus ventral) and
genotype [wild type (gray) versus toddler mutant (cyan)] (see also movie S6).
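
The track-level quantities named in panels (D), (F), and (H) can be illustrated with a short sketch. It assumes, hypothetically, that each track is a list of positions along the animal-vegetal axis in micrometers, with larger values closer to the animal pole, sampled at the 90-s interval stated for panels (C to H). The function names, the coordinate convention, the signed directionality score, and the example track are illustrative assumptions, not the analysis code used for the figure.

```python
FRAME_INTERVAL_S = 90  # imaging interval stated in the caption

def steps(positions):
    """Frame-to-frame displacements along the animal-vegetal axis."""
    return [b - a for a, b in zip(positions, positions[1:])]

def net_animal_displacement(positions):
    """Sum of all steps, i.e. net movement toward the animal pole."""
    return sum(steps(positions))

def directionality(positions):
    """(animal-directed steps - vegetal-directed steps) / number of steps.

    Ranges from -1 (purely vegetal) to +1 (purely animal); assumed score.
    """
    s = steps(positions)
    animal = sum(1 for d in s if d > 0)
    vegetal = sum(1 for d in s if d < 0)
    return (animal - vegetal) / len(s)

def mean_speed(positions, dt=FRAME_INTERVAL_S):
    """Mean absolute step size per second along the axis."""
    s = steps(positions)
    return sum(abs(d) for d in s) / (len(s) * dt)

# Hypothetical track: positions in micrometers at successive frames.
track = [0.0, 2.0, 5.0, 4.0, 8.0]
net = net_animal_displacement(track)  # 2 + 3 - 1 + 4
score = directionality(track)         # (3 - 1) / 4
speed = mean_speed(track)             # 10 um over 4 steps of 90 s
```

Separating net displacement from mean speed distinguishes cells that move quickly but without a consistent direction from cells that move slowly but persistently toward the animal pole.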
like migration pattern observed in wild-type embryos
(3, 15), but they migrated in a less directional
fashion than their wild-type counterparts
during later gastrulation (movie S1 and Fig. 3B). 

To analyze the earliest steps of mesendoderm 
movement, we followed the paths of H2B-RFP— 
labeled nuclei by light-sheet microscopy in wild- 
type and toddler mutant embryos (movie S3 and 
fig. S12). Analysis of 10 wild-type and 11 toddler 
mutant embryos confirmed that the movement 
of ventrolateral but not dorsal internalizing cells 
toward the animal pole was impaired in toddler 
mutants (Fig. 3, C to I, figs. S12 to S14, and 
movies S3 to S6). Internalization of ventrolateral
cells at the margin was delayed (Fig. 3, C and
D, fig. S13A, and movies S4 and S5) and reduced
(Fig. 3, E to G and I, fig. S13, and movies
S3 to S6). Although internalization in wild-type
embryos started about 30 min before embryos 
reached 50% epiboly, it often commenced only 
after the 50% epiboly stage in toddler mutants 
(Fig. 3, C and D, fig. S13A, and movies S4 and 
S5). Ventrolateral internalized cells moved more 
slowly (Fig. 3, H and I) and often piled up at the 
margin (Fig. 3, C and E, figs. S13 to S15, and 
movies S3 to S6). In addition, epiboly movements 
were often delayed in toddler mutants, particularly
during the time of internalization (fig. S13,
A and B). In rare cases, we observed an almost
complete absence of animal pole-directed ventrolateral
cell movements; in these embryos, ventral
and lateral marginal cells instead moved
vegetally (movies S3, S5, and S6), likely contributing
to the ectopic accumulation of posteriorly
located blood cells at later stages (Fig. 2,
A and B). These results identify Toddler as a key
signal that promotes the internalization and animal
pole-directed movement of mesendodermal
cells during gastrulation.


Overexpression of Toddler Phenocopies
Toddler Mutants


In contrast to inducers of specific cell fates, many 
signals involved in cell migration or tissue morphogenesis
share loss- and gain-of-function phenotypes.
For example, both reduction and increase
in Wnt/planar cell polarity signaling disrupt con- 
vergence and extension movements during gas- 
trulation (2-5). To determine whether Toddler 
might share this feature, we carried out overex- 
pression analyses. Injection of toddler mRNA at 
levels only five times higher (≥10 pg) than needed
for rescue caused phenotypes in wild-type em- 
bryos that resembled toddler loss-of-function mu- 
tants, including gastrulation and heart defects 
(Fig. 2, A, C, and D, and fig. S9, A and C). Sim- 
ilar phenotypes were observed upon extracellular 
injection of an in vitro-synthesized Toddler pep- 
tide fragment (C-terminal 21 amino acids; fig. S16). 
These observations reveal that proper levels of 
Toddler are required for normal mesendodermal 
movement and provide further evidence of an 
important role for Toddler in cell migration. 


Toddler Functions as a Motogen 


Most genes encoding signals that attract or repel
cells are expressed in specific domains (16),
and ubiquitous production of such signals inter- 
feres with guided cell migration. In contrast, we 
find that toddler RNA is expressed ubiquitous- 
ly (Fig. 1C and fig. S17A) and that ubiquitous 
expression of toddler mRNA upon injection at 
the one-cell stage promotes the normal move- 
ment of mesendodermal cells in toddler mu- 
tants (Fig. 2, C and D). To further test the role 
of Toddler in cell migration, we locally expressed 
Toddler in the vegetal or animal regions of toddler 
mutants. In both scenarios, localized Toddler 
production was able to promote the migration of 
mesendodermal cells and rescue toddler mutants 
(Fig. 4). Although more complex scenarios are 
formally possible [for example, local processing
(17) and self-generated gradient formation
(18, 19)], these results suggest that Toddler does
not attract cells to or repel cells from specific sites. 
Instead, Toddler appears to act as a motogen 
(20-22)—a general promoter of mesendodermal 
cell migration. 


Fig. 4. Toddler functions as a motogen. Ubiquitous or localized expression of Toddler promotes
animal pole-directed endodermal cell migration in toddler mutant embryos. Toddler was expressed
either vegetally from the yolk syncytial layer (YSL) (injection of toddler mRNA into the YSL) or animally 
from a toddler-overexpressing (OE) clone of cells transplanted into the animal pole. Dextran red injections 
into the YSL and transplantation of uninjected toddler mutant cells served as controls. Different treatments 
are illustrated on top; toddler expression domains are highlighted in cyan. All sox17 in situ hybridization 
images are lateral views of embryos at 70% epiboly (dorsal to the right). 
Toddler Acts via APJ/Apelin Receptors

To identify candidate receptors for Toddler, we 
compared the toddler phenotype to previously 
described receptor mutant phenotypes. On the 
basis of the small size of Toddler peptide and the 
involvement of G protein signaling in gastrulation 
movements, we focused on G protein-coupled 
receptors (GPCRs) as candidate Toddler receptors
(4, 23-30). Four observations raised the possibility
that the G protein-coupled APJ/Apelin
receptor might mediate Toddler signaling. First, 
loss of APJ/Apelin receptor signaling results in 
small hearts and affects lateral mesoderm mi- 
gration in zebrafish (24-26), phenotypes rem- 
iniscent of some aspects of the toddler mutant 
phenotype. However, in contrast to the broad 
roles of Toddler in mesendoderm migration, 
APJ/Apelin receptor signaling had been specif- 
ically implicated in cardiovascular development 
(24-26, 31-36). Second, overexpression of Apelin, 
the only known ligand for the APJ/Apelin recep- 
tor (35-38), interferes with gastrulation move- 
ments in zebrafish (24-26). Third, the expression 
levels of Apelin receptors and Toddler peak 
during gastrulation (Fig. 5A), and Apelin receptors
are expressed in mesendodermal cells
[fig. S16A and (24, 25, 39)], the cell types 
affected in toddler mutants. Fourth, we found 
that Apelin is expressed only at the end of 
gastrulation [Fig. 5A and (24)], after the toddler 
and APJ/apelin receptor phenotypes (24, 25, 40) 
are apparent. These observations, together with 
the milder phenotypes observed in the absence 
of Apelin as compared to loss of APJ/Apelin 
receptors (24-26, 34, 36, 41-46), raised the hy- 
pothesis that Toddler might be the bona fide 
activator of APJ/Apelin receptor signaling dur- 
ing gastrulation. We tested three predictions of 
this model. 

First, we determined whether the absence of 
Apelin receptor function phenocopies toddler mutants.
We reexamined aplnra and aplnrb double
morphants (24-26) and found phenotypes that 
were highly similar to toddler mutants, including 
reduced movement of ventrolateral mesendoderm 
during gastrulation (Fig. 5, B and C). We also 
confirmed and extended previous analyses of the 
effects of Apelin overexpression (24-26) and 
found defects very similar to those caused by Tod- 
dler overexpression (Fig. 5, B and C). In addi- 
tion, we observed that coexpression of Toddler 
and Apelin receptor at levels that individually 
did not cause major defects resulted in abnormal 
gastrulation movements reminiscent of Toddler 
and Apelin (24-26) overexpression phenotypes 
(Fig. 5D). These results reveal shared morpho- 
genetic activities of the Apelin receptor and Tod- 
dler signaling pathways. 

Second, we tested the epistatic relationship 
between Toddler and Apelin receptor signaling. 
The similarity of gain- and loss-of-function pheno- 
types precluded standard tests such as overex- 
pression of Toddler in Apelin receptor mutants. 
Instead, we tested whether activation of Apelin 
receptor signaling can bypass the requirement 
Fig. 5. Toddler acts via Apelin receptors. (A) RNA-Seq-based expression
levels of toddler, apelin, and apelin receptors (aplnra and aplnrb) during 
embryogenesis. (B) Genetic evidence for Toddler signaling via the Apelin 
receptor. Endodermal (sox17) and mesodermal [fibronectin 1 (fn1)] cell 
distributions were analyzed by in situ hybridization at 70% epiboly. Apelin 
receptor knockdown [aplnra/b morpholino (MO) injection] phenocopies 
toddler mutants, and Apelin production can rescue toddler mutants. Overex- 
pression of Apelin causes phenotypes resembling toddler mRNA overexpres- 
sion. (C) Quantification of the relative lateral spread of endoderm (left) and 
mesoderm (right). Quantifications are from multiple experiments (n = number
of embryos per category). P values for pairwise comparisons with wild
type (black, top) or toddler mutant (cyan, bottom) were calculated on the
basis of a standard Welch's t test (*P < 0.01; **P < 0.00001). (D) Synergistic
effect of Toddler and Apelin receptor b on endodermal cell migration. Injection
of toddler or aplnrb mRNA at low concentrations (2 and 15 pg,
respectively) did not cause significant defects in animal pole-directed
movement of endodermal cells (different batch of toddler mRNA than 
used in Fig. 2D). However, coinjection of both mRNAs reduced the extent of 
endoderm movement. Shown are the combined data of two independent ex- 
periments. P values for pairwise comparisons with wild type (top) or individual 
mRNA injections (bottom) were calculated on the basis of a standard Welch’s t 
test (*P < 0.01; **P < 0.00001). 


for Toddler. Apelin mRNA injection into tod- 
dler mutant embryos restored normal mesen- 
doderm migration (Fig. 5, B and C), cardiac 
development, and survival to adulthood. These 
results suggest that Toddler and Apelin activate 
the same signaling pathway. 

Third, we tested whether Toddler can drive 
the internalization of Apelin receptors (Fig. 6), a 
hallmark of activated GPCR signaling (47-50). 
We misexpressed toddler mRNA with eGFP- 
tagged Apelin receptor a or b and observed strong 
internalization of the receptors from the plasma 
membrane (Fig. 6B). This effect was specific 
because other signaling proteins (chemokines 
Sdfla/Cxcl12a or Sdflb/Cxcl12b) did not alter 
the distribution of membrane-bound Apelin re- 
ceptors, nor did Toddler alter the distribution of 
other chemokine receptors (Cxcr4a-eGFP, Cxcr4b- 
eGFP, and Cxcr7b-eGFP) (Fig. 6B and fig. S18). 
Moreover, Toddler produced from a local clone of 
cells was sufficient to cause Aplnrb-eGFP internal- 
ization at a distance from the source, suggesting 
that secreted Toddler peptide can act on neighbor- 
ing cells (Fig. 6C). This conclusion was further 
strengthened by the observation that extracellular 
injection of in vitro-synthesized C-terminal Toddler
or Apelin peptides induced efficient internalization
of Aplnr-eGFP (Fig. 6D). These results
indicate that Toddler activates Apelin receptors. 


Discussion 


Our study indicates that Toddler is an activator 
of APJ/Apelin receptor signaling, promotes gas- 
trulation movements (see summary in Fig. 6E), 
and may be the first in a series of previously un- 
known developmental signals. While this study 
was under review, Toddler (named ELABELA) 
was independently reported to signal via APJ/ 
Apelin receptors during endoderm differentiation
and heart formation (51). The HUGO Gene
Nomenclature Committee (HGNC) has recently 
designated the name Apela (apelin receptor early 
endogenous ligand) as the standardized symbol 
for Toddler/ELABELA/Ende. Our results lead to 
four major conclusions. 

First, Toddler is a previously unrecognized 
signal that promotes cell movement during 
gastrulation. The rescue of toddler mutants by 
ubiquitous Toddler expression suggests that Toddler
acts neither as a chemoattractant nor as a
chemorepellent, but rather as a nondirectional
signal to promote the internalization and move- 
ment of ventrolateral mesendodermal cells. Dorsal 
mesendoderm movement is largely unaffected 
in toddler mutants, consistent with the absence 
of Apelin receptor expression in this region and 
the role of other pathways in dorsal gastrulation 
movements (3). Both loss and overproduction 
of Toddler reduce cell movement, revealing that 
Toddler levels need to be tightly regulated to al- 
low for normal gastrulation. It remains to be de- 
termined whether Toddler promotes motility by 
regulating cell shape, cellular protrusions, cell- 
substrate interactions, and cell-cell adhesion or 
through other means. 

Second, Toddler-Apelin receptor signaling pro- 
vides a long-sought link between mesendoderm 
induction and migration. Nodal signaling not 
only induces mesendoderm formation (52) but 
also activates the expression of Apelin receptors 
[fig. S17B and (39)]. Thus, Nodal-mediated induction
of Apelin receptor expression might render
cells competent to respond to Toddler and to 
become more motile (Fig. 6E). In this scenario, 
the activation of Apelin receptor expression in 
Fig. 6. Toddler drives internalization of Apelin receptors. (A) Schematic
illustration of different treatments used to test for Toddler-mediated Apelin 
receptor internalization. (B) Test for signal-mediated internalization of eGFP- 
tagged receptors in zebrafish by coinjection of signal and receptor-eGFP 
mRNA into one-cell stage toddler mutant embryos. Receptor internalization 
was monitored by confocal microscopy. White arrows point to fluorescent 
foci of internalized receptors. In the absence of signal peptide overexpres- 
sion, ectopically expressed receptors localize to the plasma membrane in 
pregastrulation toddler mutant embryos [see control Alexa543-dextran injections
in (D)]. (C) Generation of a local source of Toddler or Sdf1a by injection
of toddler or sdf1a mRNA (together with Alexa543-dextran as tracer) into a
single cell at the 128-cell stage. Local expression of Toddler is sufficient to cause 
Aplnrb-eGFP internalization in cells that do not express toddler mRNA (non-red 
cells). (D) Extracellular injection of in vitro-synthesized C-terminal Toddler or
Apelin peptide fragments is sufficient to drive internalization of Apelin receptors. 
(E) Model of the role of Toddler-Apelin receptor signaling in mesendodermal 
cell migration during zebrafish gastrulation. Left, wild type; right, toddler; 
top, 40% epiboly (mesendoderm specification and internalization); middle, 
70% epiboly (animal pole-directed cell movement); bottom, 90% epiboly
(dorsal convergence). 


cells located at the margin at the end of the 
blastula stage would restrict the motogenic effects 
of Toddler and prevent ectopic and premature 
cell motility. 

Third, Toddler is a novel agonist of APJ/Apelin 
receptor signaling, as evidenced by Toddler-induced 
internalization of Apelin receptors and rescue of 
toddler mutants by production of the known re- 
ceptor agonist Apelin. Additionally, a fusion pro- 
tein of alkaline phosphatase and Toddler binds to 
cells expressing Apelin receptors (51). Previous
studies have implicated APJ/Apelin receptor 
signaling in a variety of biological processes, 
including the regulation of cardiovascular de- 
velopment and physiology, the control of fluid 
homeostasis, or even as a co-receptor for HIV 
infection (53, 54). Although Apelin has previously 
been the only known agonist of the APJ/Apelin 
receptor, genetic studies have found discrepancies 
between the roles of Apelin and its receptor in 
mouse (34, 36, 47, 45, 55) and zebrafish (24-26). 
For example, Apelin knockout mice are viable and 
fertile (45, 46, 56), whereas APJ/Apelin receptor 
mutant mice are born at sub-Mendelian ratios (34). 
Our studies suggest that both Toddler and Apelin 
can activate APJ/Apelin receptors and indicate 
that it is endogenous Toddler—not Apelin—that 
activates APJ/Apelin receptor signaling during zebra- 
fish gastrulation. Analogously to the promise of 
Apelin in biomedical applications (53, 54), Tod- 
dler and its derivatives may take the place of 
Apelin in therapeutic contexts. Indeed, Toddler
may also activate mammalian APJ/Apelin re- 


sion phenotypes in zebrafish (fig. S19).


ioral processes. 
A Genetic Atlas of Human Admixture History

Garrett Hellenthal, George B. J. Busby, Gavin Band, James F. Wilson, Cristian Capelli,
Daniel Falush, Simon Myers


Modern genetic data combined with appropriate statistical methods have the potential to 
contribute substantially to our understanding of human history. We have developed an approach 
that exploits the genomic structure of admixed populations to date and characterize historical 
mixture events at fine scales. We used this to produce an atlas of worldwide human admixture 
history, constructed by using genetic data alone and encompassing over 100 events occurring over 
the past 4000 years. We identified events whose dates and participants suggest they describe 
genetic impacts of the Mongol empire, Arab slave trade, Bantu expansion, first millennium CE 
migrations in Eastern Europe, and European colonialism, as well as unrecorded events, revealing 
admixture to be an almost universal force shaping human populations. 


Diverse historical, archaeological, anthropological,
and linguistic sources of information indicate
that human populations
have interacted throughout history, because of the 
rise and fall of empires, invasions, migrations, 
slavery, and trade. These interactions can result in 
sudden or gradual transfers of genetic material, 
creating admixed populations. However, the genetic
legacy of these interactions remains unknown
in most cases, and the historical record is
incomplete. We have developed an approach that
provides a detailed characterization of the mixture
events in the ancestry of sampled populations on
the basis of genetic data alone.

Fig. 1. Ancestry painting and admixture analysis of simulated admixture. (A) A
simulated event 30 generations ago between Brahui (80%, red) and Yoruba (20%,
yellow) resulted in admixed individuals having DNA segments from each source
(bottom). The true sources are then treated as unsampled. cM, centimorgan. (B)
CHROMOPAINTER's painting of the same region (yellow, Africa; green, America; red,
Central-South Asia; blue, East Asia; cyan, Europe; pink, Near East; black,
Oceania), showing haplotypic segments (“chunks”) shared with these groups. Our
model fitting narrows the donor set largely to Central-South Asia and Africa,
generating a “cleaned” painting. (C) Coancestry curves (black line) show relative
probability of jointly copying two chunks from red (Balochi; F_ST = 0.003 with
Brahui) and/or yellow (Mandenka; F_ST = 0.009 with Yoruba) donors, at varying
genetic distances. The curves closely fit an exponential decay (green line) with
a rate of 30 generations (95% CI: 27 to 33). The positive slope for the
Balochi-Mandenka curve (middle) implies that these donors represent different
admixing sources. (D) GLOBETROTTER’s source

Admixed populations should have segments of 
DNA from all contributing source groups (Fig. 1A), 




whose sizes decrease over successive generations 
because of recombination, and approaches have 
been developed to date admixture events by 
inferring the size of ancestry segments (1-5).
Between-population frequency differences of 
individual alleles may provide information on 
ancestry sources (6, 7). On the basis of these prin- 
ciples, we developed an integrated approach by 
using genome-wide patterns of ancestry to infer 
jointly both fine-scale information about groups 
involved in admixture and its timing, allowing for 
the fact that migration and admixture events can 
occur at multiple times or involve numerous groups. 


The GLOBETROTTER Method 


Our approach gains power and resolution by 
using alleles at multiple successive single-nucleotide 
polymorphisms (SNPs) (haplotypes) (8). Given a
focal population within a larger data set contain- 
ing many such groups, the chromosomes of indi- 
viduals in this population share ancestors with 
those in other populations, resulting in shared 
“chunks” of DNA. We used CHROMOPAINTER 
(8) to decompose each chromosome as a series of 
haplotypic chunks, each inferred to be shared with 
an individual from one of the other groups and 
colored (or painted) by this group (Fig. 1B). If the 
focal population is admixed, the changing colors 
along a chromosome noisily reflect true but un- 
known underlying ancestry (Fig. 1B) and so can 
be used to learn details of the source group(s) 
involved. To do this, we modeled haplotypes 
within each unsampled source group as being 
found across a weighted mixture of sampled 
“donor” populations (9). If a source group is 


genetically relatively similar to a single sampled 
population, then this population will dominate 
the inferred mixture. If there is no close proxy for 
the admixing group in the sample, especially like- 
ly for ancient admixture events or sparsely sam- 
pled regions, several donor populations will be 
needed to approximate its pattern of haplotype 
sharing. The focal population is then automati- 
cally a haplotypic mixture of the combined do- 
nors, because it is a mixture of the source groups. 
Inferring the reduced set of groups within the mix- 
ture allows us to produce a “cleaned” painting 
(Fig. 1B) using only these groups. 

To assess the evidence for admixture and date 
events, informally we measured the scale at which 
the cleaned painting changes along the genome. 
Specifically, we produced a coancestry curve for 
each pair of donor populations, plotting genetic 


distance x against a measure of how often a pair 
of haplotype chunks separated by x come from 
each respective donor (Fig. 1C), analogously to 
ROLLOFF curves (4), and averaging over un- 
certain and typically computationally estimated 
haplotypic phase (9). In theory, given a single 
admixture event, ancestry chunks inherited from 
each source have an exponential size distribu- 
tion, resulting in an exponential decay of these 
coancestry curves (9). The rate of decay in all 
curves will be equal to the time in generations 
since admixture (Fig. 1C) (4, 9, 10), allowing 
estimation of this date: Steeper decay corresponds 
to older admixture. Such a decay distinguishes 
true admixture from ancient spatial structure and 
should only occur in recipient but not donor groups 
involved in nonreciprocal admixture events. We 
test for evidence to reject (P < 0.01) a no-admixture 
Fig. 2. Overview of inferred admixture for 95 human populations. (A)
Coancestry curve for the Maya for Spanish donor group (inferred as closest
to minor admixing source), with green fitted line showing inferred exponential
decay curve and a corresponding recent admixture date (with 95%
CIs). (B and C) As (A), but showing the Druze and Kalash, respectively, with
different indicated donors (donors indicated are proxies for minor admixing 
source, inferred as closest to Yoruba and Germany/Austria, respectively) and 
with successively older admixture. (D) On the map (locations approximate in 
densely sampled regions), shapes (see legend) indicate inference: no ad- 
mixture, a single admixture event, or more complex admixture. Colors 
indicate fineSTRUCTURE clustering into 18 clades (table S11 and figs. S12
and S13). Inferred date(s) and 95% CIs are directly below the map, with two
inferred admixing sources (dots and vertical bars) shown below each date 


(see example for simulation of Fig. 1 at left). For multiple admixture times, 
these two sources correspond to the more recent event; for multiple groups, 
they reflect the strongest admixture “direction.” Colored dots above each bar 
indicate clades best representing the major (top) and minor (bottom) 
sources. The bar is split at the inferred admixture fraction (horizontal line, 
fractions <5% shown as 5%). Each bar section indicates the inferred donor 
group haplotypic makeup, colored as the map, for one source. Shaded boxes 
on the inferred admixture times denote events referred to in the text, 
specifically (label 1) European colonization of the Americas (1492 CE to 
present, fuchsia); (2) Slavic (500 to 900 CE, pink) and Turkic (500 to 1100 
CE, maroon) migrations; (3) Arab slave trade (650 to 1900 CE, cyan); (4) 
Mongol empire (1206 to 1368 CE, purple); and (5) Khmer empire (802 to 
1431 CE, orange). 
null model, that is, no exponential decay in (normalized)
coancestry curves, via bootstrapping
(9). Multiple admixture times result in a mix- 
ture of exponentials (9); if admixture is detected, 
we test for evidence of multiple admixture times 
(e.g., two episodes of admixture or more con- 
tinuous admixture over a longer period; empir- 


ical P < 0.05 in simulations), comparing the fit 
of a single exponential decay rate versus a mix- 
ture of rates. 
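
As a rough illustration of the dating step, the sketch below fits a single exponential decay to a synthetic coancestry curve. It is not the GLOBETROTTER implementation: the curve shape a·exp(−g·x/100) + c (with x in centimorgans), the noise level, the grid search, and the helper name `fit_admixture_date` are all assumptions made for this example.

```python
import numpy as np

def fit_admixture_date(x_cm, y, g_grid):
    """Return (g, a, c) minimizing squared error of y ~ a*exp(-g*x/100) + c."""
    best = None
    for g in g_grid:
        # For a fixed rate g the model is linear in (a, c), so solve by least squares.
        basis = np.column_stack([np.exp(-g * x_cm / 100.0), np.ones_like(x_cm)])
        coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
        rss = float(np.sum((basis @ coef - y) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(g), float(coef[0]), float(coef[1]))
    return best[1], best[2], best[3]

# Synthetic curve: admixture 30 generations ago plus small noise (all values invented).
rng = np.random.default_rng(0)
x_cm = np.arange(0.5, 50.0, 0.5)
y = 0.4 * np.exp(-30.0 * x_cm / 100.0) + 1.0 + rng.normal(0.0, 0.002, x_cm.size)

g_hat, a_hat, c_hat = fit_admixture_date(x_cm, y, np.arange(1.0, 201.0, 1.0))
print(f"estimated generations since admixture: {g_hat:.0f}")
```

A larger fitted g means a steeper curve and therefore an older event. Testing for multiple admixture times would amount to comparing this fit against a mixture of two such exponentials, which is not attempted here.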

The curve heights (intercepts) provide com- 
plementary information to deconvolve the num- 
ber and genetic composition of the ancestral 
sources before admixture (11). Fitted curves for
all pairs of donor groups (Fig. 1C shows three
examples) specify a pairwise intercept matrix, 
which, after normalization, we decompose by 
using a series of eigenvectors. Analogous to the 
standard use of eigenvector decomposition in 
principal components analysis (PCA) in genet- 
ics to estimate relative ancestry source contributions 


Fig. 3. Multiway admixture in Eastern Europe. Mixing per- 
centages (pie graphs) and dates (white text) inferred by using the 
strongest admixture “direction” for six eastern European groups— 
Belarus (BE), Bulgaria (BU), Hungary (HU), Lithuania (LI), Poland 
(PO), Romania (RO), analyzed when disallowing copying from near- 
by groups—and Greece (GR), analyzed by using the full set of 
94 donors. Mixing percentages indicate percentages for three 
geographic regions: “N. Europe” (Northwest Europe and East 
Europe from clades of table S11; blue), “Southern” (South Europe
and West Asia; red), and “N.E. Asia” (Northeast Asia and Yakut;
purple, also given above each pie), plus other (gray). All groups
except Greece show evidence (P < 0.05) of multiway admixture involving
sources along the approximate directions shown by the arrows.
Coancestry curves (black lines) for Bulgaria, fitted with an exponential
decay curve (green lines), exemplify this multiway signal. Each
pairing of the three donor groups, each a proxy for the admixture
source from a different region (Norway, northeast Europe; Oroqen,
Northeast Asia; and Greece, South Europe and West Asia), exhibits 
negative correlation (a dip) in ancestry weights at short genetic 
distances, implying at least three identifiably distinct ancestral 
sources mixing (approximately) simultaneously (9). 


Fig. 4. Ancient and modern admixture in Central Asia. (A) 
Dates (white text) and minority contributing sources for recent 
inferred events in nine populations (circles), analyzed disallow- 
ing copying from nearby groups, show contributions from 
Northeast Asia (purple) in the Hazara (HA), Uygur (UY), and 
Uzbekistani (UZ); East Asia (maroon) in Burusho (BU); West 
Asia (brown) in Pathan (PA); and Africa (red) in Balochi (BA), 
Brahui (BR), Makrani (MA), and Sindhi (SI). Kalash (KA, gray) 
have no inferred recent event. (B) Inferred mixing percentages 
(pie graphs) and dates (white text gives upper CI bound) for additional,
possibly shared, ancient events in seven groups (HA, UY,
and UZ have no inferred ancient events). Pie graphs show inferred
donor makeup of each group after removing the recent event
contribution from (A), if any, with colors referring to donors from
“East Asia” (Southeast Asia from clades of table S11; maroon),
“Europe” (Northwest, East, and South Europe, fuchsia), “Central 
South Asia” (orange), “West Asia” (brown), and other (white). 
Arrows indicate “directions” of ancient admixture, with donor 
regions splitting into two pairs that represent different sources. 
Coancestry curves (black lines) for Sindhi are superimposed for 
two different donor pairs representing proxies for admixing 
groups with ancestry indicated by the solid circles, indicating 
highly different exponential decay rates fit as a mixture of 7 
and 94 generations (green lines). 
for different individuals (12), the eigenvectors
allow us to estimate the relative contribution to
different admixing sources (e.g., source 1 versus
source 2) for each different donor group (9).
Also as for PCA, admixture between K distinct
source populations produces K − 1 significant
eigenvectors (13), and we test for three or more
admixing sources by testing (empirically) for evidence
of two or more such eigenvectors (P < 0.05)
(9). After iterative modeling to improve results, 
this allows us to attempt to “reverse” the admix- 
ture process (Fig. 1D) and to infer the haplotypic 
makeup of admixing source groups as well as 
admixture date(s) in our method, which we call 
GLOBETROTTER. 
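
The K − 1 property can be checked numerically on a toy matrix. The sketch below is only a stand-in for the normalized intercept matrix described above: it assumes each source has a vector of donor copying proportions and builds the admixture-weighted covariance of those vectors. The profiles, the fractions, and the function name are hypothetical.

```python
import numpy as np

def count_significant_eigenvectors(source_profiles, fractions, tol=1e-10):
    """Count non-negligible eigenvalues of a toy between-source covariance matrix."""
    P = np.asarray(source_profiles, dtype=float)   # shape (K, n_donors)
    alpha = np.asarray(fractions, dtype=float)     # shape (K,), sums to 1
    mean = alpha @ P                               # average profile of the admixed group
    centered = P - mean                            # each source's deviation from the mean
    M = (centered * alpha[:, None]).T @ centered   # sum_k alpha_k * d_k d_k^T
    eigvals = np.linalg.eigvalsh(M)
    return int(np.sum(eigvals > tol * eigvals.max()))

# Invented copying profiles over five donor groups for three sources.
profiles = [
    [0.7, 0.2, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.2, 0.6, 0.1],
    [0.1, 0.0, 0.1, 0.1, 0.7],
]
n_three = count_significant_eigenvectors(profiles, [0.6, 0.3, 0.1])
n_two = count_significant_eigenvectors(profiles[:2], [0.8, 0.2])
print(n_three, n_two)
```

Because the weighted deviations sum to zero, K sources can span at most K − 1 directions, which is why two or more significant eigenvectors point to at least three sources. With real data the eigenvalues are noisy, so significance has to be assessed empirically rather than with a fixed tolerance.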


Simulations 


To test our approach under diverse single, 
complex, and no-admixture scenarios, incorpo- 
rating many of the complexities (such as unsampled 
or admixed donor groups) likely to be present in 
real data, we simulated admixture scenarios in- 
volving real (but hidden to our analysis) human 
populations (4, 9) and populations generated under
a coalescent framework (14) incorporating
inferred (15-18) past demographic events. Admixture
was simulated between 7 and 160 generations
[200 to 4400 years, assuming 28 years per human
generation (19)] ago, with admixture fractions 3 to
50% and genetic differentiation (F_ST) between the
admixing groups varying from 0.018 (similar to
Europe versus Central Asia) to 0.185 (similar to
West Africa versus Europe). Results are detailed
online (figs. S3 to S7 and tables S1 and S5). All
populations simulated without admixture, includ- 
ing those with long-term migration, showed no 
admixture evidence (P > 0.1). Power to detect 
admixture (P < 0.01) when present was 94%, and 
95% of our 95% bootstrapped confidence inter- 
vals (CIs) contained the true admixture date, 
including cases with two distinct incidents of 
admixture or multiple groups admixing simulta- 
neously. Inferred source accuracy was very high 
(9), with, for example, the mixture representation 
predicting a haplotype composition more corre- 
lated to the true, typically unsampled, source 
population than to any single sampled population 
>80% of the time. However, source accuracy was 
lower for admixing sources contributing only 5% 
of DNA, with around 40% of such scenarios 
yielding elevated (>25%) rates of falsely infer- 
ring multiple admixture times and/or admixing 
groups. Further testing demonstrated robustness 
of GLOBETROTTER, in simulations and real 
data, to haplotypic phase inference approach used, 
inclusion/exclusion of particular chromosomes, ge- 
netic map chosen to provide genetic distances, 
and the presence of population bottlenecks since 
admixture, whereas GLOBETROTTER admix- 
ture dating was improved relative to ROLLOFF 
(4, 9). 

Nevertheless, there are multiple settings that 
we believe are challenging for our approach. 
First, although the admixing sources need not be 
sampled—often impossible because of genetic 


drift, extinction, or later admixture into the sources 
themselves—source inference is improved when 
more similar extant groups are sampled, and 
GLOBETROTTER may miss events where we 
lack any extant group that can separate sources. 
Second, sampling of several genetically very sim- 
ilar groups can mask admixture events they share. 
Similarly, a caveat is that where genuine, recent 
bidirectional gene flow has occurred, admixture 
fractions are difficult to define and interpret. How- 
ever, date estimation is predicted to still be useful, 
and in real data the majority of our inferred events 
do not appear to be bidirectional in this manner. 
Third, even in theory our approach finds it chal- 
lenging to distinguish distinct continuous “pulses” 
of admixture and continuous migration over some 
time frame (9), because of the difficulty of sep- 
arating exponential mixtures (20). If the time 
frame were narrow, we expect to infer a single 
admixture time within the range of migration 
dates. Where we infer two admixture dates, in 
particular with the same source groups, the ex- 
ponential decay signal could also be consistent 
with more continuous migration, and so we con- 
servatively refer to this as admixture at multiple 
dates. Last, we only attempt to analyze popula- 
tions with signals consistent with at most three 
groups admixing and infer at most two admix- 
ture times, and we can provide only less precise 
inference of sources for the weaker or older ad- 
mixture signal in these complex cases (9). 


Analysis of Worldwide Admixture 


By using GLOBETROTTER, we analyzed 1490 
individuals from 95 worldwide human groups 
(table S10 and fig. S12) (9), composed of 17 newly 
genotyped groups (21), 53 from the Human
Genome Diversity Panel (HGDP) (22), and 25 
from other sources (23, 24), filtered to 474,491 
autosomal SNPs. We phased the individuals by 
using IMPUTE2 (9, 25) and used fineSTRUCTURE 
(8) to verify homogeneity within labeled pop- 
ulations, to identify genetically similar and clustered 
groups, and to remove outlying individuals (figs. 
S12 to S14 and tables S10 and S11). Of the 95 
populations, 80 showed evidence (P < 0.01) of 
admixture, although nine could not be charac- 
terized by our approach (table S12). More than 
half of these have evidence of multiple waves of 
admixture (P < 0.05), and estimated admixture 
times vary from <10 to >150 generations (Fig. 2). 
We present individual results, for each population, 
via an interactive map online (26). We tested con- 
sistency of our results against a previous analysis 
of the 53 groups within the HGDP (11), which
identified 34 groups with evidence of recent ad- 
mixture. We identified (P < 0.01) admixture evi- 
dence in all 34 cases (with multiple event evidence 
in 15 cases) and obtained 95% admixture date CIs 
narrower than, but consistent with, those estimated 
by using ROLLOFF (9, 11). For 10 of 19 HGDP
groups lacking previous support for recent ad- 
mixture, GLOBETROTTER also identifies no 
events: In the remaining populations, admixture 
is inferred as occurring between genetically similar
sources (F_ST < 0.02), a challenging setting
where simulations suggest our method is more 
powerful (9). 

In several instances, GLOBETROTTER clar- 
ifies or extends previous genetic analyses. For 
example, a previous study (27) inferred admix- 
ture in the Maya, with best source populations the 
Mozabites from North Africa and the Native 
American Surui, speculating on the basis of his- 
torical events that this might actually represent a 
mixture of European, West African, and Native 
American ancestry sources. GLOBETROTTER 
inferred admixture between three groups in the 
Maya dating to around 1670 CE (9 generations 
ago) (28) (Fig. 2, A and D, fuchsia box 1), with 
distinct sources from Europe (most genetically 
similar to the Spanish), West Africa (the Yoruba), 
and the Americas (the Pima, the nearest sampled 
group in the Americas). A different method, which 
aims to detect but not date admixture, concluded 
that Cambodians trace ~16% of their DNA to a 
group equally related to modern-day Europeans 
and East Asians (29). GLOBETROTTER infers a 
~19% contribution from a similar source related to 
modern-day Central, South, and East Asians and 
an ~81% contribution from a source related 
specifically to modern-day Han and Dai, the 
latter a branch of the Tai people who entered the 
region in historical times (30) (Fig. 2D, orange 
box 5). Further, this event dates to 1362 CE 
(1194 to 1502 CE), a period spanning the end of 
the Indianized Khmer empire (802 to 1431 CE) 
(30), one of the most powerful empires in South- 
east Asia, whose fall was hypothesized to relate 
to a Tai influx (30). 
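
The calendar dates quoted here follow from generation counts. The sketch below shows one conversion that is consistent with the figures in the text: 28 years per generation is stated above, whereas the reference year of 1950 and the extra generation are assumptions chosen because they reproduce the Maya example (9 generations, around 1670 CE). The function name is hypothetical.

```python
GENERATION_YEARS = 28   # stated in the text
REFERENCE_YEAR = 1950   # assumption: approximate birth year of sampled individuals

def generations_to_year(generations):
    """Convert generations since admixture to a calendar year (negative = BCE)."""
    # Assumption: one extra generation is added to reach the admixture event itself.
    return REFERENCE_YEAR - GENERATION_YEARS * (generations + 1)

maya_year = generations_to_year(9)
print(maya_year)  # 1670, matching the Maya date quoted in the text
```

Confidence intervals on the generation count convert the same way, so an interval of a few generations corresponds to roughly a century of uncertainty in the date.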

A comparison with the historical record be- 
comes progressively more difficult for older epi- 
sodes. Even when events are well attested, their 
exact genetic impacts (if any) are rarely if ever 
known, motivating our approach. Nevertheless, 
we have identified nine groups of populations 
showing related events, incorporating almost all 
(19/20) with the strongest GLOBETROTTER 
admixture evidence (9). Results are presented as 
online maps (26). Some events appear to match 
well with particular historical occurrences, such 
as the so-called Bantu Expansion into Southern 
Africa (9). Events affecting a group of seven 
populations (Fig. 2D, purple box 4) correspond 
in time to the rapid expansion, initiated by 
Genghis Khan, of the Mongol empire (1206 to 
1368 CE) (37), one of the most dramatic events in 
human history. These populations, including the 
Hazara (32, 33), the Uygur (34), and the Mongola 
themselves, were sampled from within the range 
of the Mongol empire and show an admixture 
event dating within the Mongol Period, with one 
source closely genetically related to the Mongola 
that progressively decreases in proportion west- 
ward, to 8% in the Turkish (Fig. 2D). 

Seventeen populations from the Mediterranean, 
the Near East, and countries bordering the Arabian 
Sea (Fig. 2D, blue box 3) show signals of ad- 
mixture from sub-Saharan Africa, with most re- 
cent dates in the range 890 to 1754 CE (Fig. 2, B 
and D). We interpret these signals, consistent with
overlapping results of previous studies (4, 20), as 
resulting from the Arab expansion and slave trade, 
which originated around the seventh century CE 
(35). Our event dates are highly consistent with this 
but also imply earlier sub-Saharan African gene 
flow into, for example, the Moroccans. The highest- 
contributing sub-Saharan donor is West African 
for all 12 Mediterranean populations and an East 
or South African Bantu-speaking group for all 
five Arabian Sea populations (Fig. 2D), confirm- 
ing genetically different sources for these slave 
trades (35). 

A population group centered around Eastern 
Europe shows signals of complex admixture. 
FineSTRUCTURE did not fully separate groups 
from this region, suggesting masked shared events 
might be present. We therefore repainted them 
excluding each other as donors: We performed 
similar reanalyses of five additional geographic 
regions for the same reason (table S16 and figs. 
S16 to S21). The easterly Russians and Chuvash
both show evidence (P < 0.05) of admixture at 
more than one time (Fig. 2D), at least partially 
predating the Mongol empire, between groups 
with ancestry related to Northeast Asians (e.g., 
the Oroqen, Mongola, and Yakut) and Europeans,
respectively (table S16). Six other European pop- 
ulations (Fig. 2D, pink/maroon box 2) indepen- 
dently show evidence after the repainting for 
similar admixture events involving more than 
two groups (P < 0.02) at about the same time 
(Fig. 3). CIs for the admixture time(s) overlap 
but predate the Mongol empire, with estimates 
from 440 to 1080 CE (Fig. 3). In each population, 
one source group has at least some ancestry re- 
lated to Northeast Asians, with ~2 to 4% of these 
groups’ total ancestry linking directly to East 
Asia. This signal might correspond to a small 
genetic legacy from invasions of peoples from 
the Asian steppes (e.g., the Huns, Magyar, and 
Bulgars) during the first millennium CE (36). 
The other two source groups appear much more 
local. One is more North European in the re- 
painting, when we exclude other East European 
groups as donors, and is largely replaced by north- 
ern Slavic-speaking groups in our original analysis 
(Fig. 2D and table S12). The other source is more
southerly (e.g., Greeks and West Asians). This local 
migration could explain a recent observation of an 
excess of identity-by-descent sharing in Eastern 
Europe—including in the Greeks, in whom we 
infer admixture involving a group represented by 
Poland, at the same time—that was dated to a wide 
range between 1000 and 2000 years ago (37). We 
speculate that these events may correspond to the 
Slavic expansion across this region at a similar 
time, perhaps related to displacement caused by 
the Eurasian steppe invaders (38). 

Last, Central Asia shows a particularly com- 
plex inferred history after a reanalysis of 10 groups 
excluding each other as donors, with 9 of 10 groups 
showing diverse recent events (Fig. 4A). The ex- 
ception is the Kalash, a genetically isolated (39) 
population from the Hindu Kush mountains of 
Pakistan (40). Distinct, ancient, and partially shared
admixture signals (always dated older than 90 BCE) 
are seen in six groups (Fig. 4B), including the 
Kalash (Fig. 2C), whose strongest signal sug- 
gests a major admixture event (990 to 210 BCE) 
from a source related to present-day Western 
Eurasians, although we cannot identify the geo- 
graphic origin precisely. This period overlaps that 
of Alexander the Great (356 to 323 BCE), whose 
army, local tradition holds, the Kalash are de- 
scended from (40), but these ancient events pre- 
date recorded history in the region, precluding 
confident interpretation. 

Our results demonstrate that it is possible to 
elucidate the effect of ancient and modern mi- 
gration events and to provide fine-scale details of 
the sources involved, the complexity of events, 
and the timing of mixing of groups by using ge- 
netic information alone. Where independent in- 
formation exists from alternative historical or 
archaeological sources, our approach provides 
results consistent with known facts and deter- 
mines the amount of genetic material exchanged. 
In other cases, novel mixture events we infer are 
plausible and often involve geographically nearby 
sources, supporting their validity. Admixture events 
within the past several thousand years affect most 
human populations, and this needs to be taken into 
account in inferences aiming to look at the more 
distant history of our species. Future improve- 
ments in whole-genome sequencing, greater sam- 
ple sizes, and incorporation of ancient DNA, 
together with additional methodological exten- 
sions, are likely to allow better understanding of 
ancient events where little or no historical record 
exists, to identify many additional events, to infer 
sex biases, and to provide more precise event 
characterization than currently possible. We be- 
lieve our approach will extend naturally to these 
settings, as well as to other species. 
Precise and Ultrafast Molecular Sieving Through Graphene Oxide Membranes

R. K. Joshi, P. Carbone, F. C. Wang, V. G. Kravets, Y. Su, I. V. Grigorieva, H. A. Wu,
A. K. Geim, R. R. Nair


Graphene-based materials can have well-defined nanometer pores and can exhibit low frictional 
water flow inside them, making their properties of interest for filtration and separation. We 
investigate permeation through micrometer-thick laminates prepared by means of vacuum 
filtration of graphene oxide suspensions. The laminates are vacuum-tight in the dry state but, 

if immersed in water, act as molecular sieves, blocking all solutes with hydrated radii larger than 
4.5 angstroms. Smaller ions permeate through the membranes at rates thousands of times faster 
than what is expected for simple diffusion. We believe that this behavior is caused by a network 
of nanocapillaries that open up in the hydrated state and accept only species that fit in. The 
anomalously fast permeation is attributed to a capillary-like high pressure acting on ions inside 


graphene capillaries. 


Porous materials with a narrow distribution of
pore sizes, especially in the angstrom range
(1-5), are of interest for use in separation
technologies (5-7). The observation of fast
permeation of water through carbon nanotubes
(8-10) and, more recently, through graphene-oxide
(GO) laminates (11) has led to many proposals
to use these materials for nanofiltration
and desalination (8-19). GO laminates are particularly
attractive because they are easy to fabricate
and mechanically robust and should be amenable
to industrial-scale production (20, 21). They are
made of impermeable functionalized graphene
sheets that have a typical size of L ~ 1 μm and the
interlayer separation, d, sufficient to accommodate
a mobile layer of water (11-25). Nanometer-thick
GO films have recently been tried for pressure-driven
filtration, revealing promising characteristics
(15-18). However, the results varied widely
for different fabrication methods, and some observations
relevant to the present report (permeation
of large molecules) are inconsistent with the
known structure of GO laminates (20, 21). This
suggests the presence of cracks or pin holes in
those GO thin films, which obscured their intrinsic
properties (25).

We studied micrometer-thick GO membranes
prepared from GO suspensions using vacuum filtration
as described in (25). The resulting membranes
were checked for their continuity by using
a helium leak detector before and after filtration
experiments, which demonstrated that the membranes
were vacuum-tight in the dry state (11).
Schematics of our permeation experiments are 
shown in Fig. 1. The feed and permeate com- 
partments were initially filled to the same height 
with different liquids, including water, glycerol, 
toluene, ethanol, benzene, and dimethyl sulfoxide 
(DMSO). No permeation could be detected over 
a period of many weeks by monitoring liquid 
levels and using chemical analysis (25). If both 
compartments were filled with water solutions, 


permeation through the same vacuum-tight mem- 
brane could be readily observed as rapid changes 
in liquid levels (several millimeters per day). For 
example, a level of a 1 M sucrose solution in the 
feed compartment rose, whereas it fell in the per- 
meate compartment filled with deionized water. 
For a membrane with a thickness of 1 μm, we
found water flow rates of 0.2 L m⁻² h⁻¹, and the
speed increased with increasing molar concentration C.
Because a 1 M sucrose solution corresponds
to an osmotic pressure of ~25 bar at
room temperature (the van’t Hoff factor is 1 in
this case), the flow rates agree with the evaporation
rates of 10 L m⁻² h⁻¹ reported for similar
membranes in (11), in which case the permeation
was driven by a capillary pressure of the order of 
1000 bar. The hydrostatic pressures in our experi- 
ments never exceeded 10 ~ bar and, therefore, 
could be neglected. 
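
The ~25 bar figure and the comparison with the evaporation rates can be reproduced with a short calculation. The sketch below uses the van’t Hoff relation π = iCRT. The room temperature of 298 K and the assumption that flux scales linearly with driving pressure are choices made for this example and are not stated in the text.

```python
R = 8.314  # gas constant, J mol^-1 K^-1

def osmotic_pressure_bar(molar_conc, temperature_k=298.0, vant_hoff_factor=1.0):
    """van't Hoff osmotic pressure in bar for a concentration given in mol/L."""
    conc_mol_m3 = molar_conc * 1000.0          # mol/L -> mol/m^3
    pressure_pa = vant_hoff_factor * conc_mol_m3 * R * temperature_k
    return pressure_pa / 1e5                   # Pa -> bar

pi_bar = osmotic_pressure_bar(1.0)             # 1 M sucrose, i = 1

# Assumed linear scaling: 10 L m^-2 h^-1 at ~1000 bar capillary pressure.
scaled_flux = 10.0 * pi_bar / 1000.0
print(f"osmotic pressure ~{pi_bar:.1f} bar, scaled flux ~{scaled_flux:.2f} L m^-2 h^-1")
```

Under these assumptions the scaled flux comes out close to the 0.2 L m⁻² h⁻¹ reported for the 1-μm membrane, which is the sense in which the two sets of rates agree.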

We next investigated the possibility that dis- 
solved ions and molecules could diffuse through 
the capillaries simultaneously with water. We 
filled the feed compartment with various solu- 
tions to determine whether any of the solutes 
permeated into the deionized water on the other 
side of the GO membrane (Fig. 1B). As a quick 
test, ion transport can be probed by monitoring 
electrical conductivity of the permeate compart- 
ment (fig. S1). We found that for some salts (for 
example, NaCl), the conductivity increased with 
time, but for others {for example, K₃[Fe(CN)₆]}, it
did not change over many days of measurements. 

Depending on the solute, we used ion chro- 
matography, inductively coupled plasma optical 
Fig. 1. Ion permeation through GO laminates. (A) Photograph of a GO membrane covering a 1-cm
opening in a copper foil. (B) Schematic of the experimental setup. A U-shaped tube 2.5 cm in diameter is
divided by the GO membrane into two compartments referred to as feed and permeate. Each is filled to a
typical level of ~20 cm. Magnetic stirring is used so as to ensure no concentration gradients. (C) Permeation
through a 5-μm-thick GO membrane from the feed compartment with a 0.2 M solution of MgCl₂. (Inset)
Permeation rates as a function of C in the feed solution. Within our experimental accuracy (variations by a
factor of <40% for membranes prepared from different GO suspensions), chloride rates were found to be the
same for MgCl₂, KCl, and CuCl₂. Dotted lines are linear fits.
emission spectrometry, total organic carbon analysis,
and optical absorption spectroscopy (25) to
measure permeation rates for a range of molecules
and ions (table S1). An example of our measurements
for MgCl₂ is shown in Fig. 1C, using ion
chromatography and inductively coupled plasma
optical emission spectrometry for Mg²⁺ and Cl⁻,
respectively. Concentrations of Mg²⁺ and Cl⁻ in
the permeate compartment increased linearly 
with time, as expected. Slopes of such curves 
yield permeation rates. The observed rates de- 
pend linearly on concentration in the feed com- 
partment (Fig. 1C, inset). Cations and anions move 


10! 


through membranes in stoichiometric amounts so 
that charge neutrality within each of the com- 
partment is preserved. For example, in Fig. 1C 
permeation of one Mg”* ion is accompanied by 
two CI ions, and there is no electric field buildup 
across the membrane. 
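
The step from a measured concentration curve to a permeation rate can be sketched as follows. This is a minimal illustration and not the analysis behind Fig. 1C: the function names, the ordinary least-squares fit, and the normalization by permeate volume and membrane area are assumptions made here for clarity.

```python
def permeation_rate(times_h, conc_mol_per_l, volume_l, area_m2):
    """Permeation rate in mol h^-1 m^-2 from a concentration-vs-time curve.

    Assumption (not from the source): the rate is the least-squares slope of
    permeate concentration vs. time, multiplied by the permeate volume and
    divided by the exposed membrane area.
    """
    n = len(times_h)
    if n < 2 or n != len(conc_mol_per_l):
        raise ValueError("need at least two matching (time, concentration) points")
    t_mean = sum(times_h) / n
    c_mean = sum(conc_mol_per_l) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (c - c_mean)
              for t, c in zip(times_h, conc_mol_per_l))
    slope = sxy / sxx                      # mol L^-1 h^-1
    return slope * volume_l / area_m2      # mol h^-1 m^-2


def chloride_rate_from_mg(mg_rate):
    """Charge neutrality for MgCl2: two Cl- ions accompany each Mg2+ ion."""
    return 2.0 * mg_rate
```

Comparing `chloride_rate_from_mg` applied to the fitted Mg²⁺ rate with the independently fitted Cl⁻ rate gives a quick consistency check on stoichiometric transport.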

Our results obtained for different ionic and 
molecular solutions are summarized in Fig. 2. 
The small species permeate with approximately 
the same speed, whereas large ions and organic 
molecules exhibit no detectable permeation. The 
effective volume occupied by an ion in water is 
characterized by its hydrated radius. If plotted 

Fig. 2. Sieving through the atomic-scale mesh. The permeation rates shown are normalized per 1 M 
feed solution and measured by using 5-μm-thick membranes. Some of the tested chemicals are named 
here; the others can be found in table S1 (25). No permeation could be detected for the solutes shown 
within the gray area during measurements lasting for at least 10 days. The thick arrows indicate our 
detection limit, which depends on the solute. Several other large molecules, including benzoic acid, DMSO, 
and toluene, were also tested and exhibited no detectable permeation. The dashed curve is a guide to 
the eye, showing an exponentially sharp cutoff at 4.5 Å, with a width of ≈0.1 Å. 


Fig. 3. Simulations of molecular sieving. (A) Snapshot of NaCl diffusion through a 9 Å graphene 
slit allowing two layers of water. Na⁺ and Cl⁻ ions are in yellow and blue, respectively. (B) Permeation 
rates for NaCl, CuCl₂, MgCl₂, propanol, toluene, and octanol for such capillaries. For octanol, which 
dissolves poorly in water, the hydrated radius is not known, and we use its molecular radius. Blue marks 
indicate permeation cutoff for an atomic cluster (inset) for graphene capillaries accommodating two and 
three layers of water (widths of 9 and 13 Å, respectively). 

as a function of this parameter, our data are well 
described by a single-valued function with a 
sharp cutoff at ~4.5 Å (Fig. 2). Species larger than 
this are sieved out. This behavior corresponds 
to a physical size of the mesh of 9 Å. Also, 
permeation rates do not exhibit any notable dependence 
on ion charge (Fig. 2) (12, 13, 23, 26): 
triply charged ions such as AsO₄³⁻ permeate 
at approximately the same rate as singly 
charged Na⁺ or Cl⁻. Last, to prove the essential 
role of water for ion permeation through GO laminates, 
we dissolved KCl and CuSO₄ in DMSO, 
the polar nature of which allows solubility of these 
salts. No permeation was detected, confirming the 
special affinity of GO laminates to water. 

To explain the observed sieving properties, 
we use the model previously suggested to account 
for unimpeded evaporation of water through GO 
membranes (11). Individual GO crystallites have 
two types of regions: functionalized (oxidized) 
and pristine (21, 27, 28). The former regions act 
as spacers that keep adjacent crystallites apart 
and also prevent them from being dissolved. In a 
hydrated state, the spacers help water to intercalate 
between GO sheets, whereas pristine regions 
provide a network of capillaries that allow nearly 
frictionless flow of a layer of correlated water, 
similar to the case of water transport through 
carbon nanotubes (8–10). The earlier experiments 
using GO laminates in air (typical d ≈ 
9 ± 1 Å) were explained by assuming one monolayer 
of moving water. For GO laminates soaked 
in water, d increases to ≈13 ± 1 Å (fig. S2), which 
allows two or three water layers (19, 22, 23, 29). 
Taking into account the effective thickness of 
graphene of 3.4 Å (interlayer distance in graphite), 
this yields a pore size of ~9 to 10 Å, 
which is in agreement with the mesh size found 
experimentally. 
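
The arithmetic behind this pore-size estimate can be written out in a few lines. The sketch below is illustrative only: the function names are ours, and the hard-sphere criterion in `fits_in_pore` (a species passes if its hydrated diameter does not exceed the pore width) is a simplifying assumption rather than the model used in the simulations.

```python
GRAPHENE_THICKNESS_A = 3.4  # effective graphene thickness in angstroms (from the text)


def pore_size(interlayer_spacing_a):
    """Free pore width in angstroms: interlayer spacing d minus graphene thickness."""
    return interlayer_spacing_a - GRAPHENE_THICKNESS_A


def fits_in_pore(hydrated_radius_a, pore_a):
    """Assumed hard-sphere test: the hydrated diameter must not exceed the pore width."""
    return 2.0 * hydrated_radius_a <= pore_a


wet_pore = pore_size(13.0)   # laminate soaked in water, d of about 13 angstroms
dry_pore = pore_size(9.0)    # laminate in air, d of about 9 angstroms
```

For d ≈ 13 Å this gives a pore of about 9.6 Å, so under the hard-sphere assumption species with hydrated radii up to roughly 4.8 Å would pass, which is close to the ~4.5 Å cutoff seen in Fig. 2.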

To support our model, we have used molecular 
dynamics (MD) simulations. The setup is 
shown in Fig. 3A, in which a graphene capillary 
separates feed and permeate reservoirs, 
and its width is varied between 7 and 13 Å to 
account for the possibility of one, two, or three 
layers of water (25). We find that the narrowest 
capillaries become filled with a monolayer of 
water as described previously (11) and do not 
allow even such small ions as Na⁺ and Cl⁻ inside. 
However, for two and three layers expected in 
the fully hydrated state (25), ions enter the capillaries 
and diffuse into the permeate reservoir. 
Their diffusion rates are found to be approximately 
the same for all small ions and show little 
dependence on ionic charge (Fig. 3B). Larger 
species (toluene and octanol) cannot permeate 
even through capillaries containing three layers 
of water (fig. S3). We have also modeled large 
solutes as atomic clusters of different size (25) 
and found that the capillaries accommodating 
two and three layers of water reject clusters with 
radii larger than ~4.7 and 5.8 Å, respectively. 
This may indicate that the ion permeation 
through GO laminates is limited by regions containing 
two layers of water. The experimental 
and theoretical results in Figs. 2 and 3B show good 
agreement. 

Following (11), we estimate that for our laminates 
with h ≈ 5 μm and L ≈ 1 μm, the effective 
length of graphene capillaries is L × h/d ≈ 5 mm 
and that they occupy d/L ≈ 0.1% of the surface 
area. This estimate is supported by measuring 
the volume of absorbed water, which is 
found to match the model predictions (25). For 
a typical diffusion coefficient of ions in water 
(≈10⁻⁵ cm² s⁻¹), the expected diffusion rate 
for a 1 M solution through GO membrane is 
≈10⁻³ mol h⁻¹ m⁻² (25), that is, several thousands 
of times smaller than the rates observed 
experimentally (Fig. 1C). Such fast transport 
cannot be explained by the confinement, which 
increases the diffusion coefficient by only a factor 
of 3/2, reflecting the change from bulk to 
two-dimensional water. Moreover, functionalized 
regions [modeled as graphene with randomly attached 
epoxy and hydroxyl groups (20, 21)] 
do not enhance diffusion but rather suppress it 
(25, 29) as expected because of the broken translational 
symmetry. 
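
As a rough numerical check of this estimate, the following sketch evaluates a steady-state Fick's-law flux through the capillary network. It is our reading of the estimate, not the derivation given in (25): the function name, the default d ≈ 1 nm, and the assumption of zero concentration on the permeate side are all illustrative choices.

```python
def expected_diffusion_rate(diff_cm2_per_s=1e-5, conc_mol_per_l=1.0,
                            h_um=5.0, l_um=1.0, d_nm=1.0):
    """Fickian estimate of ion permeation in mol h^-1 m^-2 of membrane area.

    Assumes flux = D * C / (capillary length) scaled by the fraction of the
    surface occupied by capillaries, with zero concentration in the permeate.
    """
    diff = diff_cm2_per_s * 1e-4          # cm^2/s -> m^2/s
    conc = conc_mol_per_l * 1e3           # mol/L  -> mol/m^3
    h = h_um * 1e-6                       # membrane thickness, m
    l = l_um * 1e-6                       # crystallite size, m
    d = d_nm * 1e-9                       # interlayer spacing, m
    capillary_length = l * h / d          # effective path length (about 5 mm)
    open_fraction = d / l                 # capillary share of the surface (about 0.1%)
    flux_per_s = diff * conc / capillary_length * open_fraction
    return flux_per_s * 3600.0            # mol s^-1 m^-2 -> mol h^-1 m^-2


rate = expected_diffusion_rate()
```

With the default values this evaluates to roughly 7 × 10⁻⁴ mol h⁻¹ m⁻². Because the result scales as d²/(L²h), modest changes in the assumed spacing do not alter its order of magnitude.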

To understand the ultrafast ion permeation, 
we recall that graphite-oxide powders exhibit 
extremely high absorption efficiency with re- 
spect to many salts (30). Despite being densely 
stacked, our GO laminates are found to retain 
this property for salts with small hydrated radii 
[(25), section 6]. Our experiments show that per- 
meating salts are absorbed in amounts reaching 
as much as 25% of the membrane’s initial weight 
(fig. S2). The large intake implies highly concentrated 
solutions inside graphene capillaries (close 
to saturation). Our MD simulations confirm 
that small ions prefer to reside inside capillaries 
(fig. S4). The affinity of salts to graphene capil- 
laries indicates an energy gain with respect to the 
bulk water, and this translates into a capillary-like 
pressure that acts on ions within a water medium 
(25). Therefore, there is a large capillary force, 
sucking small ions inside GO laminates and fa- 
cilitating their permeation. Our MD simulations 
provide an estimate for this ionic pressure as 
>50 bars (25). 


The reported GO membranes exhibit extra- 
ordinary separation properties, and their full 
understanding will require further experimental 
and theoretical work. With the ultrafast ion 
transport and atomic-scale pores, GO membranes 
already present an interesting material to consider 
for separation and filtration technologies, par- 
ticularly those that target extraction of valuable 
solutes from complex mixtures. By avoiding the 
swelling of GO laminates in water (by using me- 
chanical constraints or chemical binding), it may 
be possible to reduce the mesh size down to ~6 Å; 
in which case, one monolayer of water would still 
go through, but even the smallest salts would be 
rejected. 

Designing Collective Behavior in a Termite-Inspired Robot Construction Team 

Justin Werfel, Kirstin Petersen, Radhika Nagpal 


Complex systems are characterized by many independent components whose low-level actions produce 
collective high-level results. Predicting high-level results given low-level rules is a key open challenge; 
the inverse problem, finding low-level rules that give specific outcomes, is another. We present a multi-agent 
construction system inspired by mound-building termites, solving such an inverse problem. A user 
specifies a desired structure, and the system automatically generates low-level rules for independent 
climbing robots that guarantee production of that structure. Robots use only local sensing and 
coordinate their activity via the shared environment. We demonstrate the approach via a physical 
realization with three autonomous climbing robots limited to onboard sensing. This work advances the 
aim of engineering complex systems that achieve specific human-designed goals. 


In contrast to the careful preplanning and regimentation 
that characterize human construction 
projects, animals that build in groups do 
so in a reactive and decentralized way. The most 
striking examples are mound-building termites, 
colonies of which comprise millions of independently 
behaving insects that build intricate structures 
orders of magnitude larger than themselves 
(1, 2) (Fig. 1, A and B). These natural systems 
inspire us to envision artificial ones operating via 
similar principles (3, 4), with independent agents 
acting together to build elaborate large-scale 
structures, guided by reacting to the local situations 
they encounter. Such systems could enable 
construction in settings where human presence 
is dangerous or problematic, as in disaster areas 
or extraterrestrial environments. 

Engineering an automated construction sys- 
tem that operates by termite-like principles rather 
than human-like ones requires an ability to design 
complex systems with desired collective behavior 
(e.g., producing a particular user-specified building). 
The hallmark of complex systems of independent 
agents (5–7) is unexpected collective 
behavior that emerges from their joint actions, 
not readily predictable from knowledge of agent 
rules. If a specific collective behavior is desired, 
no method in general is known to find agent rules 
that will produce it. 

We present a decentralized multi-agent sys- 
tem for automated construction of user-specified 
structures, thereby providing a solution to such a 
problem of complex system design. An arbitrary 
number of independent robots follow an iden- 
tical set of simple, local rules that collectively 
produce a specific structure requested by a user 
(Fig. 1, C and D). The rules are automatically 
generated from a high-level representation of the 
final target structure and provide provable guar- 
antees of correct completion of that structure. The 
challenges associated with engineering a complex 
system are addressed by using principles drawn 
from social insects—in particular, indirect coor- 
dination through manipulation and sensing of a 
shared environment (stigmergy), and behavioral 
regularities that constrain the space of possible 
outcomes—which together make analysis and 
execution tractable. We first present the theoretical 
foundation for this work, followed by a physical 
implementation with three independent robots 
demonstrating autonomous construction. 

The independence of individual robots stands 
in contrast to other work on automating construction 
with single (8–10) or multiple (11–14) robots 
with centralized sensing and/or control. Central- 
ized systems that provide a global computing 
authority and/or precise positioning information 
during run time, in settings where such features 
are feasible, can have advantages in aspects such 
as efficiency and run-time flexibility. Conversely, 
decentralization provides advantages including op- 
portunities for greater scalability (no coordinating 
authority that can become overloaded) and robust- 
ness (no single point of failure). 

We distinguish between two types of build- 
ing processes (Fig. 2). A system may produce a 
predetermined outcome, in which many possible 
system trajectories all lead to the same guaran- 
teed final state. Alternatively, variation during the 
process may lead to a variable outcome, in which 
the final state is determined during the course of 
construction and can change if the process is 
rerun. In the context of human construction, sin- 
gle buildings are built via the first type of pro- 
cess, in which the order of operations might vary 
but the final result always matches a blueprint; 
cities develop via the second type of process, in 
which choices are contingent on previous deci- 
sions such that many distinct results are possible. 
Here, we focus on designing processes with fixed 
outcomes, but also show how our system can be 
used to generate structures that vary each time 
robots construct them. 

Our system design is motivated by the goal 
of relatively simple, independent robots with 
limited capabilities (15), able to autonomously 
build a large class of nontrivial structures using 
a single type of prefabricated building material 
(solid “bricks”). We require a robot to be able to 
move forward, move back, and turn in place; 
climb up or down a step the height of one brick; 
and pick up one brick, move while carrying it, 
and attach it directly in front of itself at its own 
level. Robots can build staircases of bricks to 
climb to higher levels. Robots are limited to local 
sensing, able to perceive only bricks and other 
robots in their immediate vicinity. Information 
about the current state of the overall structure and 
the actions of more distant robots is not avail- 
able. Robots obtain information about where bricks 
have been attached only through direct inspec- 
tion; after they leave an area, this information is 
liable to become outdated as other robots modify 
the structure. The structure, built from square bricks 
in a nonoverlapping grid pattern, provides a ref- 
erence that robots can use to keep track of their 
relative movement around it. A single “seed” 
brick, the initiation point from which the contig- 
uous structure is built, provides a unique landmark. 

We take an approach derived from the classi- 
cally insect-inspired notion of stigmergy (2, 3, 16), 
in which, instead of any explicit broadcast or one- 
to-one communication between agents, all com- 
munication is implicit via the joint manipulation 
of a shared environment. In particular, we focus 
on qualitative stigmergy (2) in which actions are 
triggered by qualitatively different stimuli, such 
as distinct arrangements of building material. 
Robots in our system add bricks to the structure 
in response to existing configurations of bricks. 
In doing so, the rules they follow must be con- 
structed in such a way that correct completion of 
the target structure is guaranteed, despite stale 
information about other parts of the structure, and 
irrespective of the (potentially variable) number 


Fig. 1. Natural and artificial collective construction. (A and B) Complex 
meter-scale termite mounds (A) are built by millimeter-scale insects (B), which act 
independently with local sensing and limited information. (C) Physical implementation 
of our system, with independent climbing robots that build using specialized 
bricks. (D) System overview for building a specific predetermined result (Fig. 2, A 
and C): A user specifies a desired final structure; an offline compiler converts it to a 
“structpath” representation (Fig. 3), which is provided to all robots; robots follow 
local rules that guarantee correct completion of the target structure (movie S1). 

of other robots and the order and timing of their 
own actions. The fact that all robots follow the 
same rules (17), just as all insects in the same colony 
follow consistent behavior patterns or other animals 
obey intraspecific social conventions (18), 
helps to constrain outcomes and restricts the space 
of possible situations that robots typically encounter 
and must be able to handle. 

To ensure a predetermined outcome, robot 
rules are based on sensed brick configurations 
plus a static internal representation of the target 
structure. A user provides a picture or other high-level 
representation of the desired final structure, 
specifying what sites are ultimately meant to be 
occupied by bricks. An offline compilation step 
converts this to a “structpath” representation that 
provides movement guidelines for robots at each 
location (in a sense, traffic laws appropriate for 
that structure) (19) (Fig. 3). Robots then follow a 
fixed set of simple rules (19), referring to the static 
structpath representation and otherwise identical 
for any target structure; these ensure the growth 
of that structure to completion in a way consistent 
with robot capabilities (movie S1). The rules 
rely on locally available information, preserve a 
robot’s ability to move freely over the structure 
and opportunities for parallelism, and prevent 
deadlocks and other situations where the actions 
of one robot interfere with those of another. 
Robots act reactively (20, 21); they do not preplan 
their actions, as is appropriate in this decentralized 
multirobot approach in which a robot 
setting out to perform a specific task might find 
it already completed by the time it gets there. 

The structpath representation specifies a set 
of paths that robots can follow through the struc- 
ture layout while respecting their movement 
constraints. In particular, all paths start from the 
seed and require climbing up or down a height 
of at most one brick at a time. The structpath 
specifies a fixed direction for robots to travel be- 
tween each pair of neighboring sites; off the 
structure, robots follow its perimeter strictly counter- 
clockwise. This directional restriction smooths 
traffic flow, ensures a flow of material into the 
growing structure (avoiding excessive backtrack- 
ing from, e.g., laden robots making way for un- 
laden ones to exit), and allows regularities in 
structure growth that let local rules ensure the 
preservation of global invariants. Paths may split 


and merge; a robot may have more than one 
way of leaving or entering a site. A multiplicity 
of possible paths helps the system exploit the 
parallelism of the swarm. The compiler performs 
a recursive search to identify a set of paths with- 
out cycles that meets these requirements, or to 
determine that none exists. 
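
The constraints that a structpath must satisfy can be illustrated with a small checker. This is a hedged sketch, not the compiler described in (19): it only verifies a candidate structpath rather than searching for one, and the data layout (a dictionary of target heights keyed by grid site, plus a set of directed edges) and the function name are assumptions introduced here.

```python
from collections import deque


def check_structpath(heights, edges, seed):
    """Return True if a candidate structpath meets the constraints in the text.

    heights: dict mapping site (x, y) -> target height in bricks
    edges:   set of (src, dst) pairs, the fixed travel direction between sites
    seed:    the site where robots climb onto the structure
    """
    if seed not in heights:
        return False
    succ = {site: [] for site in heights}
    indegree = {site: 0 for site in heights}
    for src, dst in edges:
        if src not in heights or dst not in heights:
            return False
        # Travel is between grid neighbours only.
        if abs(src[0] - dst[0]) + abs(src[1] - dst[1]) != 1:
            return False
        # A robot can climb or descend at most one brick per step.
        if abs(heights[src] - heights[dst]) > 1:
            return False
        succ[src].append(dst)
        indegree[dst] += 1

    # Every site must lie on some path that starts from the seed.
    seen, queue = {seed}, deque([seed])
    while queue:
        for nxt in succ[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    if seen != set(heights):
        return False

    # Kahn's algorithm: the paths must contain no cycles.
    ready = deque(site for site in heights if indegree[site] == 0)
    ordered = 0
    while ready:
        site = ready.popleft()
        ordered += 1
        for nxt in succ[site]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return ordered == len(heights)
```

A search procedure could enumerate edge orientations and call a check of this kind on each candidate; the additional geometric requirements on brick placement in (19) are not modeled here.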

Individual robots then repeat the following 
routine: With a brick, circle the structure perimeter 
until reaching the seed; climb onto the structure 
and move along any legal path, keeping 
track of relative position with respect to the seed; 
attach the brick at any vacant site whose local 
neighborhood satisfies a fixed set of geometric 
requirements (19); continue to follow the path off 
the structure; obtain a new brick. These rules can 
be shown to guarantee successful completion of 
the target structure while ensuring that no intermediate 
state calls for a robot to perform tasks 
beyond its capabilities: in particular, climb or 
descend a height of greater than one brick, attach 
a brick at a higher or lower level than itself, or 
force a brick into place directly between two 
others (a mechanically difficult operation requiring 
high precision) (19). Direct interaction 


Fig. 2. Two types of building process. (A) Different possible sequences all 
lead to the same predetermined endpoint. (B) Different sequences lead to 
different results, which are determined only during the course of construction. 
(C) Example of the first type of process, building a step pyramid modeled after 
the main temple at Chichen Itza [photo by Kyle Simourd, CC BY 2.0]. Upper 
and lower panels show snapshots from different possible sequences (movie 
S2), at approximately 10%, 25%, 50%, 67%, and 80% completion of the 
common final structure. (D) Example of the second type of process, building a 
set of one-brick-high ramifying paths. (E) A hybrid system can combine 
elements of both types; in this example, paths of stochastically determined 
lengths lead to buildings chosen randomly from a set of predefined structures. 
See (19) for details of agent rules for all cases. 

between robots is limited to each one yielding 
to the one ahead of it in this physical loop. 
Because robots may take multiple possible 
paths through a structure, the ordering of the build- 
ing process can occur in many different ways. 


Fig. 3. Target structures and corresponding structpaths. For each predefined target structure 
at left, the corresponding structpath representation at right is generated by the offline compiler (19). 
From top to bottom: a simple structure with a unique structpath if the seed location is given; the temple 
of Fig. 2C, showing one of many possible structpaths; a structure enclosing internal courtyards. Sites in the 
structpath are shaded according to height (darker = higher); a dot marks the seed brick. Directions are 
color-coded to clarify flows (red, left; blue, right; green, up; yellow, down). 

Accordingly, the structure will emerge in different 
ways in different instances of building with 
the same structpath, with intermediate structures 
that may be observed in one instance but not 
another; however, the agent rules guide the 


Fig. 4. Hardware demonstration. Independent autonomous robots with purely onboard sensing collectively work on prespecified structures. (A) A castle-like 
structure (movie S3). (B) A sequence of overhead snapshots building a branching structure (movie S4). 

process to always end in the same final structure 
(movie S2). 

In addition to this approach for producing 
predetermined structures, the same robots can use 
different local rules to build structures whose de- 
tailed form emerges from the construction pro- 
cess. Multiple structures built with the same 
rules share qualitative features but differ in de- 
tail. Such a rule set could, for example, be used to 
generate a randomized street layout for a building 
complex. Figure 2E shows an example of a hybrid 
system built by such a rule set (19), where 
buildings chosen randomly from a set of pre- 
defined types are positioned at the ends of lanes 
of stochastically determined lengths. The robots 
again use stigmergy to coordinate their actions; 
for example, particular configurations of bricks 
constitute cues to agree on which building type 
should be constructed at the end of a given lane. 

To demonstrate the feasibility of such a de- 
centralized multirobot construction system, we 
present a proof-of-concept implementation in hardware 
(19) (Fig. 1C and Fig. 4). Design choices 
were driven by the requisite primitive operations 
that robots must perform: pick up a brick from a 
cache; attach a brick directly in front of them- 
selves; detect nearby robots; when on the struc- 
ture, move forward one site (while staying at the 
same level or climbing up or down one brick) 
or turn in place 90° left or right; when off the 
structure, circle its perimeter. For locomotion, 
we equipped robots with whegs [hybrid wheel- 
legs (22)], chosen for their empirical effective- 
ness in climbing (23). Each robot is equipped with 
seven active infrared sensors to detect black-and- 
white patterns on the bricks and ground for 
navigation; an accelerometer to register tilt angle 
for climbing and descent; an arm to lift and lower 
bricks, with a spring-loaded gripper to hold them 
securely while carried; and five ultrasound sonar 
units that let each robot evaluate and maintain its 
distance from the structure perimeter, as well as 
detect other sonar-using robots nearby. The robot 
footprint (17.5 cm × 11.0 cm) is smaller 
than that of the bricks, facilitating the robots’ 
maneuverability atop a wall one brick wide. Bricks 
(21.5 cm x 21.5 cm x 4.5 cm) are made from 
expanded urethane foam, with physical features 
to achieve self-alignment and neodymium mag- 
nets for attachment. 

This hardware system demonstrates multiple 
simultaneously active, independent robots executing 
the full algorithm with entirely on-board 
sensing. Supplementary movies show fully autonomous 
robots working on different user-specified 
structures (Fig. 4 and movies S3 and S4), adding 
bricks both atop the structure and on the ground, 
climbing over the structure as they build it, and 
adapting to one another’s presence and actions, 
without human intervention beyond reloading the 
brick cache. The reactive nature of the approach 
can be demonstrated via externally imposed changes 
made to a structure while robots work on it 
(movie S5). Many approaches to mobile robotics 
deal with the continuous, noisy real world by 
probabilistically modeling its uncertainty (24); 
our system instead uses carefully engineered hard- 
ware to effectively discretize robot actions on the 
structure. Sensor feedback and brick features 
matched to the robots allow reasonable reliability 
with simple control (19). Minor deviations from 
ideal behavior are corrected by compensatory 
routines and/or passive mechanical features; for 
example, drift in position atop a structure is re- 
duced both by robots checking their pose with 
respect to the brick markings, and by indentations 
on brick upper faces that guide the robots to 
stay within tolerance. Although extending this 
research prototype to a full production system 
would require solving many additional engineer- 
ing challenges (19), our work demonstrates that 
physical hardware can allow the discretized theory 
to sufficiently represent the continuous reality. 

This work provides an example of an 
engineered complex system, with multiple au- 
tonomous robots following simple, local rules 
and collectively achieving a specific desired 
result. Tools drawn from the social insects that 
inspire our approach—the exploitation of regu- 
larities that arise from identical programming in 
multi-agent systems, and the use of the envi- 
ronment as a means of implicit coordination— 
make these results possible. Future progress in 
our ability to design complex systems will advance 
our capacity to engineer systems that work as 
nature does (25–27), with large numbers of functionally 
limited, interchangeable parts, individually 
unreliable, collectively robust. 

High-Energy Surface X-ray Diffraction for Fast Surface Structure Determination 

J. Gustafson, M. Shipilin, C. Zhang, A. Stierle, U. Hejral, U. Ruett, O. Gutowski, 
P.-A. Carlsson, M. Skoglundh, E. Lundgren 


Understanding the interaction between surfaces and their surroundings is crucial in many 
materials-science fields, such as catalysis, corrosion, and thin-film electronics, but existing 
characterization methods have not been capable of fully determining the structure of surfaces 
during dynamic processes, such as catalytic reactions, in a reasonable time frame. We 
demonstrate an x-ray diffraction–based characterization method that uses high-energy photons 
(85 kiloelectron volts) to provide unexpected gains in data acquisition speed by several orders 
of magnitude and enables structural determinations of surfaces on time scales suitable for in situ 
studies. We illustrate the potential of high-energy surface x-ray diffraction by determining the 
structure of a palladium surface in situ during catalytic carbon monoxide oxidation and follow 
dynamic restructuring of the surface with subsecond time resolution. 


Understanding solid surfaces and their interactions 
with their surroundings has been 
a major research field for decades, motivated 
by important areas such as catalysis, corrosion, 
nanotechnology, and thin-film electronics 
(1–6). Therefore, a large number of experimental 
techniques with high surface sensitivity have been 
developed. However, many of these techniques, 
such as low-energy electron diffraction (LEED), 
x-ray photoelectron spectroscopy, and low-energy 
ion scattering (1–3), gain surface sensitivity from 
the limited mean-free path of the electrons or ions 
used as probes. Hence, these methods typically 
need ultra-high vacuum (UHV) conditions, and 
the use of these techniques to study surface struc- 
tures under near-ambient conditions requires sev- 
eral stages of differential pumping. 

In contrast, surface x-ray diffraction (SXRD) 
(7–9) is a photon-in–photon-out technique, and 
because x-rays interact weakly with matter, the 
method is much less influenced by any gas sur- 
rounding the sample. When x-rays interact with 
a crystalline material, the scattered x-rays form 
a diffraction pattern, which provides structural 
information about the crystal’s reciprocal lattice— 
the Fourier transform of the real-space atomic 
lattice—and specifically the periodicity (and thus 
indirectly the positions) of the atoms. The recip- 
rocal lattice will be dominated by Bragg reflec- 
tions from diffraction in the bulk of the crystal. 
Because of the broken periodicity at the surface, 
so-called crystal truncation rods (CTRs) connect 
the Bragg reflections perpendicular to the sur- 
face. In case the in-plane periodicity at the surface 
differs from that of the underlying bulk, additional 
superstructure rods arise in reciprocal space. The 
shapes of these CTRs and superstructure rods hold 
detailed information about the atomic surface 
structure, but the weak interaction of x-rays with 
matter requires intense synchrotron radiation to 
detect the surface signal. Thus, SXRD is one of 
very few methods available for surface structure 
determination under ambient conditions. 

A serious drawback of conventional SXRD, 
with x-rays in the range of 10 to 30 keV and a 
point or small two-dimensional (2D) detector, is 
the limited amount of data that can be acquired 
in a reasonable time frame. Exploring 2D maps 
from a substantial part of reciprocal space is ex- 
tremely time-consuming, and mapping of the 3D 
reciprocal space with high resolution is currently 
impossible even with synchrotron radiation. As 
a result, the probed surface structure has to be 
known qualitatively from other measurements, 
and an unexpected structure may easily be left 
unnoticed, especially under harsh conditions. Fur- 
thermore, obtaining a quantitative data set (from 
an already qualitatively known structure) takes on 
the order of 10 hours with traditional use of SXRD. 

We demonstrate how the use of high-energy 
x-rays (85 keV) in combination with a large 2D 
detector accelerates the data collection by several 


Fig. 1. Interpretation of the HESXRD patterns. (A) Illustration of how CTRs from
a clean Pd(100) surface cross the Ewald sphere during sample rotation (see also
movie S1). h, k, and l are the reciprocal lattice vectors defined such that h and k
are in the surface plane, l is perpendicular to the surface plane, and the
Bragg reflections of the substrate appear at integer values. (B) The corresponding
detector image in which the Bragg
reflections from the Pd substrate are indexed, and the points where CTRs cross
the Ewald sphere are marked with arrows. The dark rectangles at the Pd Bragg
reflections stem from absorbers protecting the detector. The reciprocal lattice
vectors are defined as $|a^*| = |b^*| = 2\pi/d_{110}$; $|c^*| = 2\pi/a_0$ with $a_0 = 3.89$ Å (Pd
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orders of magnitude and enables full surface- 
structure determination by 3D mapping of re- 
ciprocal space on a time scale suitable for in situ 
studies. In addition, the small diffraction angles, 
resulting from the high photon energy, and the 
large detector result in data that are easily presented 
in a more intuitive way, because each detector im- 
age contains the projection of a full plane in re- 
ciprocal space and straight lines in reciprocal 
space correspond to straight lines on the detector. 
We provide a proof-of-principle demonstration of 
high-energy SXRD (HESXRD) by performing a 
full determination of the surface structure during 
catalytic CO oxidation over Pd(100). However, our 
method is general and also applicable to other 
surface- or interface-related in situ studies. 
Figure 1A shows a schematic map of the re- 
ciprocal lattice of the Pd(100) crystal surface used 
in our study, including Bragg reflections (green 
spots) as well as CTRs (green lines). To probe a cer- 
tain point in the reciprocal lattice, it is necessary 
to (i) rotate the sample so that the so-called Ewald
sphere (the spherical surface in reciprocal space
that is probed at a given orientation between the
sample and the beam) intersects this point and (ii)
place the detector so that the correspondingly scat- 
tered x-rays are detected. Using traditional photon 
energies (10 to 30 keV) necessitates moving the 
detector around the sample, while only being able 
to probe a small part of reciprocal space simulta- 
neously. For higher photon energies, however, the 
size of the Ewald sphere increases and the scatter- 
ing angles decrease. Hence, the use of high-energy 
x-rays enables the detection of large volumes of 
reciprocal space with a stationary 2D detector. 
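
The energy dependence described above can be checked with a short calculation. The sketch below is illustrative only: it assumes the textbook relations λ = hc/E, Ewald sphere radius 2π/λ, and Bragg's law, and it uses the Pd(200) plane spacing a0/2 (with a0 = 3.89 Å, the Pd bulk lattice constant quoted in the Fig. 1 caption) as an example reflection. The function names are hypothetical, and 20 keV is simply a representative value from the conventional 10 to 30 keV range.

```python
import math

HC_KEV_ANGSTROM = 12.39842  # h*c in keV*Angstrom


def wavelength(energy_kev):
    """X-ray wavelength in Angstrom for a photon energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def ewald_radius(energy_kev):
    """Ewald sphere radius |k| = 2*pi/lambda, in inverse Angstrom."""
    return 2.0 * math.pi / wavelength(energy_kev)


def scattering_angle_deg(energy_kev, d_spacing):
    """Scattering angle 2*theta in degrees from Bragg's law, lambda = 2 d sin(theta)."""
    s = wavelength(energy_kev) / (2.0 * d_spacing)
    return 2.0 * math.degrees(math.asin(s))


A0 = 3.89         # Pd bulk lattice constant in Angstrom (from the Fig. 1 caption)
D_200 = A0 / 2.0  # spacing of the (200) planes, chosen here only as an example

for energy in (20.0, 85.0):
    print(f"{energy:5.1f} keV: Ewald radius = {ewald_radius(energy):6.2f} 1/A, "
          f"2theta(200) = {scattering_angle_deg(energy, D_200):5.2f} deg")
```

Because the radius scales linearly with photon energy, going from 20 to 85 keV enlarges the sphere by a factor of 4.25 and compresses the same reflection into a much smaller scattering angle, which is why a stationary flat detector can intercept many rods at once.
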
The section of the Ewald sphere that is cov- 
ered by the detector in our experiment is shown 
schematically in Fig. 1A. The red circles indicate 
points where a CTR intersects the Ewald sphere 
and where the intensity of scattered x-rays is 
recorded. By using an incidence angle close to 
the critical angle for total external reflection (0.04° 
in this case), the surface signal is maximized. The 
corresponding detector image is shown in Fig. 1B, 
where the two CTRs marked in Fig. 1A are in- 
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dicated by arrows. Also seen in the detector image 
are the shadows of tungsten pieces placed in front 
of the detector to protect it from the high-intensity 
Bragg reflections (about 7 orders of magnitude 
more intense than the CTRs). The lack of intensity
in the bottom center and the top corners of the
detector image is due to a mask that prevented
x-rays scattered in the walls of the experimental
chamber from reaching the detector (10).

To map out the reciprocal space, the sample 
(and hence the reciprocal lattice) is rotated around 
the surface normal, such that the Ewald sphere 
scans through the reciprocal space as illustrated 
in movie S1. The result is a 3D data set of the 
diffraction from the Pd(100) surface, and the scan 
takes on the order of 10 min. With HESXRD, we 
performed a full surface structure determination 
in situ during catalytic CO oxidation over Pd(100), 
which has not been possible before. In a specially 
designed SXRD flow reactor (11), the sample was
exposed to a total gas pressure of 75 Torr at 600 K.
In the gas mixture, a flow of 2 ml/min of O2 and
4 ml/min of CO was set using Ar as carrier gas, 
resulting in partial pressures of O2 and CO of ~3 
and 6 Torr, respectively. Under these conditions, 
the Pd(100) surface is highly catalytically active, 
converting all of the CO reaching the surface to 
CO2. During the reaction, HESXRD data were collected
by a sample rotation (as described above)
over 90°, and the diffraction intensities were recorded
every 0.1° with an exposure time of 1 s. The
data show a surface with a (√5 by √5) rotated
(R) 27° surface oxide structure (henceforth denoted
√5), consisting of a single PdO(101) plane
on top of the Pd(100) surface (Fig. 2A) (12, 13).
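
The scan parameters quoted above fix the size of the data set. As a rough check, the sketch below counts the frames and the pure exposure time for a 90° rotation in 0.1° steps at 1 s per frame. It assumes no readout or motor overhead, which the text does not specify, and the function name is hypothetical.

```python
def scan_budget(angular_range_deg, step_deg, exposure_s):
    """Return (number of frames, total exposure time in seconds) for a rotation scan."""
    n_frames = round(angular_range_deg / step_deg)
    return n_frames, n_frames * exposure_s


frames, seconds = scan_budget(90.0, 0.1, 1.0)
print(frames, "frames,", seconds / 60.0, "min of exposure")
```

The frame count matches the 900 detector images combined in Fig. 2B, and the exposure time is of the same order as the roughly 10 min quoted for a scan.
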

The data can be visualized in different ways. 
In Fig. 2B, we show the result of combining 900 
detector images to render the maximum-intensity 
result for each pixel. Such a projection onto the 
rotational plane provides a direct view of all the 
CTRs (indexed below the image) and super- 
structure rods (indexed above) within the probed 
volume. One can immediately identify the dif- 
ferent structures present at the surface and draw 
some qualitative conclusions. For instance, the 
bulk lattice constant); $d_{110} = a_0/\sqrt{2}$. a* points along the face-centered cubic
bulk [011] direction, b* along [0,−1,1], c* along [100], perpendicular to the
surface. a*, b*, and c* are the basis vectors spanning the reciprocal lattice, i.e.,
defining the direction and units of the h, k, and l axes.
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x-ray scattering intensity corresponding to the 
superlattice rods contains no Bragg reflections, 
indicating that the oxide film is very thin. 
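
The maximum-intensity projection used for Fig. 2B is simple to state in code. The sketch below is a pure-Python illustration on tiny made-up frames, not the authors' processing pipeline; the function name is hypothetical. With NumPy arrays the same operation is a maximum over the frame axis.

```python
def max_intensity_projection(frames):
    """Combine equally sized 2D frames by keeping, for every pixel,
    the largest value found in any frame."""
    return [
        [max(pixel_values) for pixel_values in zip(*rows)]
        for rows in zip(*frames)
    ]


# Three tiny 2x3 "detector images" standing in for the 900 real ones.
frames = [
    [[0, 1, 0], [2, 0, 0]],
    [[5, 0, 0], [0, 0, 1]],
    [[0, 0, 3], [0, 4, 0]],
]
projection = max_intensity_projection(frames)
print(projection)
```

Because every pixel keeps its brightest value, a rod that lights up in only a few frames survives in the combined image, but so does any transient background, which is consistent with the enhanced powder rings noted in the Fig. 2 caption.
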

To allow comparison of the results obtained by 
HESXRD with those expected from LEED, which 
is the most common surface diffraction method 
used under UHV conditions, we show a slice in 
the hk-plane of the HESXRD data around l = 0.5
reciprocal lattice unit (RLU) (Fig. 2C). Projected
onto the data is a map of the reciprocal lattice corresponding
to the Pd(100) substrate (green squares)
and the surface oxide (red dots and circles). A
zoom-in (Fig. 2D) shows the spots at (h, k) = (0.4,
−0.8), (0.6, −0.8), (0.8, −0.6), and (0.8, −0.4), corresponding
to the periodicity of the √5 structure.
It is, however, apparent that these spots are not
perfectly in the √5 periodicity but that there is a
stress-induced mismatch between the oxide and
the substrate, as has previously been reported (13).
This mismatch is readily visible in HESXRD, but 
it is not possible to infer its presence with LEED 
because of resolution limitations. 
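
The ideal √5 rod positions quoted above follow from the geometry of the supercell. The sketch below assumes that the (√5 by √5)R27° cell is spanned by the real-space vectors (2, 1) and (−1, 2) in units of the Pd(100) surface cell, with a second, mirror-related domain spanned by (2, −1) and (1, 2). These vectors and the function names are assumptions made for illustration; the text itself gives only the resulting (h, k) values.

```python
import math


def reciprocal_basis(a1, a2):
    """Reciprocal vectors (in substrate RLU) of a 2D supercell whose real-space
    vectors a1, a2 are given in substrate lattice units (b_i . a_j = delta_ij)."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    b1 = (a2[1] / det, -a2[0] / det)
    b2 = (-a1[1] / det, a1[0] / det)
    return b1, b2


def rod_positions(a1, a2, max_index=3):
    """All in-plane (h, k) positions m*b1 + n*b2 with |m|, |n| <= max_index."""
    b1, b2 = reciprocal_basis(a1, a2)
    spots = set()
    for m in range(-max_index, max_index + 1):
        for n in range(-max_index, max_index + 1):
            h = round(m * b1[0] + n * b2[0], 6)
            k = round(m * b1[1] + n * b2[1], 6)
            spots.add((h, k))
    return spots


DOMAIN_1 = ((2, 1), (-1, 2))   # assumed supercell vectors
DOMAIN_2 = ((2, -1), (1, 2))   # assumed mirror domain

spots = rod_positions(*DOMAIN_1) | rod_positions(*DOMAIN_2)
observed = [(0.4, -0.8), (0.6, -0.8), (0.8, -0.6), (0.8, -0.4)]
rotation_deg = math.degrees(math.atan2(1, 2))   # angle of (2, 1) to the substrate axis
cell_length = math.hypot(2, 1)                  # supercell edge in substrate units

print(all(s in spots for s in observed), round(rotation_deg, 1), round(cell_length, 3))
```

Under these assumptions the four quoted spots fall on the ideal superstructure grid of one domain or the other, the cell edge is √5 substrate units, and the rotation is about 27°. The measured spots deviate slightly from these ideal positions because of the mismatch discussed in the text.
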

Atomic positions on the surface can be de- 
termined from the intensity variations along the 
CTRs and superstructure rods. For this, the intensities
are translated into structure factors, which,
after taking standard correction factors into account
(10), are proportional to the square root of the
diffraction intensity. These are then compared to
calculated structure factors. Here, we used the
atomic positions determined from quantitative
LEED and density functional theory (13) and
calculated corresponding structure factors using
the software package ANA-ROD (14). We found
good agreement between the experimental and 
calculated structure factors (Fig. 2, E and F, for 
one CTR and one superstructure rod and fig. S6 
for all available rods). In addition to establishing 
that our HESXRD method is suitable for struc- 
tural determination, the good agreement between 
the experimental and calculated structure factors 
also shows that the structure found in our study is 
similar to the one revealed by LEED. However, 
our data were recorded during steady-state CO 
oxidation conditions, whereas the structure investigated
with LEED was prepared in pure O2
under UHV conditions (13).
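
The comparison between measured and calculated structure factors can be illustrated with a minimal sketch. It assumes that the correction factors enter as a single multiplicative factor per data point and that the experimental and calculated amplitudes are matched by one least-squares scale; neither detail is specified in the text, and ANA-ROD's own fitting procedure is not reproduced here. The numbers are made up for the example.

```python
import math


def structure_factor_amplitudes(intensities, corrections):
    """|F| = sqrt(corrected intensity), with the correction applied multiplicatively."""
    return [math.sqrt(i * c) for i, c in zip(intensities, corrections)]


def best_scale(f_exp, f_calc):
    """Scale s that minimizes sum((f_exp - s * f_calc)**2)."""
    return sum(e * c for e, c in zip(f_exp, f_calc)) / sum(c * c for c in f_calc)


def r_factor(f_exp, f_calc):
    """Generic residual: sum|f_exp - s*f_calc| / sum(f_exp), after scaling."""
    s = best_scale(f_exp, f_calc)
    return sum(abs(e - s * c) for e, c in zip(f_exp, f_calc)) / sum(f_exp)


# Made-up intensities along one rod, with unit correction factors.
f_exp = structure_factor_amplitudes([400.0, 100.0, 25.0], [1.0, 1.0, 1.0])
f_calc = [4.0, 2.0, 1.0]   # hypothetical calculated amplitudes (arbitrary units)

print(f_exp, best_scale(f_exp, f_calc), r_factor(f_exp, f_calc))
```

A residual near zero means the calculated rod profile reproduces the measured one up to an overall scale, which is the sense in which the text reports good agreement.
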

In addition to facilitating the collection of 
data for structural determinations, probing a larger 
part of reciprocal space simultaneously makes 
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HESXRD an ideal tool for time-resolved mea- 
surements. We illustrate these capabilities by fol- 
lowing how the surface structure of Pd(100), during 
catalytic CO oxidation, responds to a change in 
the gas stoichiometry (15). At a constant sample
temperature of 575 K, we changed from a gas
mixture of 6 Torr CO and 1.5 Torr O2 (with Ar
as carrier gas) to 6 Torr CO and 3 Torr O2 (i.e.,
the same mixture as used above) and monitored
the partial pressures of the reactant and product
gases with mass spectrometry (MS). The time
dependence of the O2, CO, and CO2 signals is
shown in Fig. 3A. The most important features
are found at 100 s, when the O2 pressure was
increased, and after about 550 s, when there is a
sudden increase in the CO2 production. Diffraction
data were collected continuously while keeping
the sample at a fixed position set to reveal the
presence and structure of any oxide on the surface.
Diffraction images were recorded every 0.5 s,
resulting in a movie of 2000 frames (movie S3). In
Fig. 3C, we show three snapshots from the movie at
the times indicated by the lines I to III in Fig. 3A,
chosen such that the images are representative for
the diffraction at the time (I) before, (II) during, and
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Fig. 2. In situ HESXRD data from the surface 
oxide on Pd(100) measured during CO oxidation 
in a flow of 6 Torr CO and 3 Torr O2 at a sample 
temperature of 600 K. (A) Top and side view of the 
relevant oxygen-induced (√5 by √5)R27° surface oxide
structure. The structure is an O-Pd-O trilayer corresponding
to one PdO(101) plane (12, 13). (B) All images collected during the rotational scan combined into a single image, in which the CTRs and superlattice rods
are indicated. The way these images are combined enhances background noise, such as the powder diffraction rings originating from polycrystalline defects
in the crystal. (C) In-plane view (hk-plane at l = 0.5) of the angular range measured in the current experiment. The Pd CTRs (squares) and surface oxide
superlattice rods (circles) are indicated. (D) Magnification of reciprocal space showing the superlattice reflections. The reflections are not in the middle of the
red circle, directly revealing the mismatch between the PdO(101) and the Pd(100) substrate. (E and F) Extracted (dots) CTRs and superlattice rods from the
rotational images as seen in (B), and calculated structure factors (full lines). All the detector images are shown individually in movie S2.
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Fig. 3. Time-resolved measurement of the response of the surface structure 
to a change in reaction gas stoichiometry from 6 Torr CO and 1.5 Torr O2 to 
6 Torr CO and 3 Torr O2 at a sample temperature of 575 K. (A) Mass spectrometry
data showing the O2, CO, and CO2 signals during the experiment.
(B) Integrated diffraction intensities in the areas shown by the white boxes in
panel (C-I), corresponding to the surface and bulk oxides, respectively. (C)
HESXRD snapshots during the experiment from the times indicated by I to III
in (A). (D) Schematics of the surface structure at the times I, II, and III in Fig.
3, A to C, illustrating the gradual formation of an oxide layer.


(III) after the increase in catalytic activity at around
550 s. At time I, no oxide can be detected, which
means that the surface is metallic. The surface is at
this stage predominantly covered by CO, hindering
the dissociative adsorption of O2, and the reaction is
so-called CO self-poisoned (see Fig. 3D) (16). At
time II, at which the catalytic activity increases
considerably, two rods corresponding to the √5
surface oxide appear. As the reaction proceeds,
the oxide structure continues to develop, and,
when the new steady state is reached at time
III, there is an increased intensity at an elevated
l value (shown by the white arrow in Fig. 3C),
revealing the onset of the growth of a several- 
layer-thick PdO film with a (101)-oriented surface 
(17, 18). To enable a more direct comparison be- 
tween the MS and diffraction data, Fig. 3B shows 
the integrated intensities inside the white boxes 
shown in panel I of Fig. 3C. There is a strong 
correlation between the formation of the surface 
oxide and the increased activity, and the activity 
remains high during the development of the thicker 
oxide. Hence, this proof-of-principle study strong- 
ly indicates that the activity for CO oxidation is 
particularly high over the PdO(101) surface. Al- 
though our best fit indicates that the PdO(101) 
film covers close to 100% of the surface, we can- 
not exclude minor areas, such as defects, affecting 
the activity, a subject for intense discussion in 
recent literature (19, 20).
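
The integrated intensities plotted in Fig. 3B amount to summing the pixels inside a fixed box for every frame of the movie. The sketch below shows that step on made-up frames. The box coordinates, the frame contents, and the function names are hypothetical; only the 0.5 s frame interval is taken from the text.

```python
def integrate_roi(frame, box):
    """Sum pixel values inside box = (row_start, row_stop, col_start, col_stop);
    stop indices are exclusive."""
    r0, r1, c0, c1 = box
    return sum(sum(row[c0:c1]) for row in frame[r0:r1])


def roi_time_series(frames, box, frame_interval_s=0.5):
    """Return (time in seconds, integrated intensity) pairs, one per frame."""
    return [(i * frame_interval_s, integrate_roi(f, box)) for i, f in enumerate(frames)]


# Two made-up 3x3 frames: an empty one and one with signal inside the box.
frames = [
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 1, 2], [0, 3, 4], [9, 9, 9]],
]
surface_oxide_box = (0, 2, 1, 3)   # hypothetical box around a superstructure rod

series = roi_time_series(frames, surface_oxide_box)
print(series)
```

Plotting such a series on the same time axis as the mass spectrometry signals is what allows the onset of the oxide rods to be compared directly with the rise in CO2 production.
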

Our results demonstrate how high-energy
x-rays open up new opportunities in the use of
surface x-ray diffraction. Thanks to the high energy,
the scattering angles decrease and the diffraction
pattern can be collected on a stationary
2D detector, which simplifies and speeds up the 
data acquisition process considerably. There are, 
however, some drawbacks with the high energy. 
The small critical angle for total reflection makes 
the sample alignment very sensitive, for instance, 
to changes of the sample temperature. This can be 
overcome with an automatic feedback system for 
sample alignment. Further, the scattered intensity, 
as well as the sensitivity of the detectors, drops 
with increasing photon energy. This is, to a large 
extent, made up for by the improved source and 
high-energy optics performance at third-generation 
synchrotron radiation sources. The high penetra- 
tion ability at high energies also enables the use 
of complex sample environments, crucial for in 
situ experiments, and structure determination of 
buried or electrochemical solid-liquid interfa- 
ces, without relying on thin-film geometries. 
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Not Dead Yet 


Morgan T. Page* and Susan E. Hough 


The extent to which ongoing seismicity in intraplate regions represents long-lived aftershock activity 
is unclear. We examined historical and instrumental seismicity in the New Madrid central U.S. region 
to determine whether present-day seismicity is composed predominantly of aftershocks of the 
1811-1812 earthquake sequence. High aftershock productivity is required both to match the 
observation of multiple mainshocks and to explain the modern level of activity as aftershocks; 
synthetic sequences consistent with these observations substantially overpredict the number of 
events of magnitude ≥ 6 that were observed in the past 200 years. Our results imply that ongoing
background seismicity in the New Madrid region is driven by ongoing strain accrual processes and 
that, despite low deformation rates, seismic activity in the zone is not decaying with time. 


plate boundaries, as evidenced by earth- 

quakes that occur in stable continental re- 
gions. Intraplate earthquakes, which are related 
to the internal deformation of plates rather than 
motion at plate boundaries, can be large and damaging,
as with the 2001 Bhuj earthquake (1). In
this work, we study the 1811-1812 New Madrid 
sequence, which is of paramount importance for 
understanding intraplate seismogenesis and for 
probabilistic seismic hazard assessment in the 
central and eastern United States and other mid- 
continental regions. The sequence included four 
events that were widely felt throughout the cen- 
tral and eastern United States, conventionally 
regarded as three primary mainshocks and the 
large dawn aftershock following the first main- 
shock. Magnitude estimates for these events have 
varied widely, from a low of magnitude (M4) ~ 7 
for the largest mainshocks (2) to values over 8 
in magnitude (3). 

Aftershocks of the 1811-1812 sequence have 
been considered in two ways. Several studies 
have used archival accounts of large aftershocks 
and/or tallies of felt earthquakes to estimate mag- 
nitudes for large aftershocks and consider the over- 
all magnitude distribution of early aftershocks 
[e.g., (4, 5)]. Two studies have considered the 
long-term rate of seismicity in the New Madrid 
Seismic Zone (NMSZ) and concluded that it is 
well characterized as a long-lived aftershock se- 
quence (6, 7). It is important to note, however, 
that these latter two studies do not show a fit, 
from 1811 to present, to traditional Omori decay 
(8, 9). Such direct evidence has been observed 
for the classic long-lived aftershock sequence 
following the 1891 Nobi earthquake, for which 
an Omori decay can be seen for 100 years (10).
In the New Madrid case, however, a direct fit is 
not possible given uncertainties in the early New 
Madrid catalog. In this study, we reconsider the 
long-lived aftershock hypothesis using rigorous 
tests assuming an Epidemic Type Aftershock Sequence
(ETAS) model (11). ETAS modeling allows
us to determine probabilities of observing robust 
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features of the New Madrid catalog, should the 
long-lived aftershock hypothesis be true. 

The ETAS model, developed on the premise 
that all earthquakes potentially trigger their own 
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aftershocks, successfully explains the empirical 
Omori decay law, which, so far as is known, uni- 
versally describes the temporal decay of aftershocks. 
The ETAS model explains observed foreshock 
rates and multiplets (12) and has been shown to
accurately characterize seismicity, including both
short- and long-term aftershock sequences [e.g.,
(13)], and is now a widely used short-term earthquake
clustering model (14). The model has been
used to characterize and forecast seismicity rates
in a wide range of tectonic environments, including
intraplate regions and regions characterized by
swarmy activity (15, 16). In this work, we use ETAS
modeling in an attempt to generate synthetic cat- 
alogs that match well-constrained features of the 
New Madrid earthquake sequence (see materials 
and methods in the supplementary materials). 
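
For readers unfamiliar with the Omori decay that underlies ETAS, the sketch below evaluates the modified Omori (Omori-Utsu) rate n(t) = K / (t + c)^p and its integral, which gives the expected number of direct aftershocks in a time window. For p = 1 it reduces to the original Omori form. The parameter values are arbitrary illustrations, not the values explored in the study, and the function names are hypothetical.

```python
import math


def omori_rate(t, K, c, p):
    """Aftershock rate n(t) = K / (t + c)**p at time t after the mainshock."""
    return K / (t + c) ** p


def expected_count(t1, t2, K, c, p):
    """Expected number of direct aftershocks between t1 and t2 (integral of the rate)."""
    if abs(p - 1.0) < 1e-12:
        return K * math.log((t2 + c) / (t1 + c))
    return K * ((t2 + c) ** (1.0 - p) - (t1 + c) ** (1.0 - p)) / (1.0 - p)


# Arbitrary illustrative parameters, with time measured in days.
K, c, p = 50.0, 0.05, 1.1
first_year = expected_count(0.0, 365.0, K, c, p)
last_decade = expected_count(190 * 365.0, 200 * 365.0, K, c, p)
print(first_year, last_decade)
```

The power-law tail is what makes a centuries-long aftershock sequence conceivable at all: the rate never drops to zero, so the question examined in the text is whether parameters productive enough to sustain the present-day rate remain compatible with the rest of the catalog.
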

To test the long-lived aftershock hypothesis, 
we identified three robust observational constraints 
that are not dependent on particular contentious 
magnitude values. Our first imposed constraint 
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Fig. 1. Seismicity in the New Madrid region (CEUS catalog, 1800—2008, M > 4). Note that the 


early catalog is not complete to M4. 
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is that the sequence included four principal events 
of comparable magnitude, separated by no more 
than 0.7 magnitude units. This is based on the 
range in event magnitudes inferred by different 
studies (2, 3, 17). Although the absolute mag- 
nitudes of these earthquakes remain a subject for 
debate, the relative magnitudes are much more 
reliably determined. Analysis of prehistoric sand- 
blows in the NMSZ shows that protracted se- 
quences, with multiple large mainshocks, are 
apparently the norm for this region (18).

The second constraint is on the recent rate of 
moderate-sized (M > 4) earthquakes. Because
using different catalogs and box sizes produces
different estimates, we used the most conservative
estimate of three M > 4 earthquakes over
10 years (Fig. 1), taken from the Central and
Eastern United States Seismic Source Characterization
(CEUS-SSC) catalog (19) (see materials
and methods).

The third constraint is the number of moderate
(M ≥ 6) events in the NMSZ after the initial
cluster in the first year. The CEUS-SSC catalog
(19) includes two such events, the 1843 Marked
Tree, Arkansas, and 1895 Charleston, Missouri,
earthquakes, both with preferred magnitudes of
6.0. Although a recent reinterpretation of macroseismic
effects of the 1843 earthquake (20) estimates
a lower preferred magnitude of 5.4, we
assume, for conservatism, that the sequence
produced no more than two M ≥ 6 late events
(see materials and methods). 
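
The three constraints can be expressed as simple tests on a catalog of (time, magnitude) pairs. The sketch below is one possible reading, not the authors' code: times are in years after the first mainshock, the 2-month window for the early cluster follows the Fig. 2 caption, and the recent-rate test is assumed to mean at least three events at or above the threshold in the final 10 years of a 200-year catalog. The toy catalog uses placeholder mainshock magnitudes; only the two magnitude 6.0 events at 32 and 84 years (1843 and 1895) echo the text.

```python
def early_clustering_ok(catalog, window_years=2.0 / 12.0, max_spread=0.7):
    """Constraint 1: the four largest events in the first 2 months lie
    within max_spread magnitude units of each other."""
    early = sorted((m for t, m in catalog if t <= window_years), reverse=True)
    return len(early) >= 4 and early[0] - early[3] <= max_spread + 1e-9


def recent_rate_ok(catalog, end_year=200.0, span_years=10.0, min_events=3, mag=4.0):
    """Constraint 2 (assumed form): at least min_events with magnitude >= mag
    in the last span_years of the catalog."""
    n = sum(1 for t, m in catalog if end_year - span_years < t <= end_year and m >= mag)
    return n >= min_events


def late_large_ok(catalog, max_events=2, mag=6.0, after_year=1.0):
    """Constraint 3: no more than max_events with magnitude >= mag after the first year."""
    n = sum(1 for t, m in catalog if t > after_year and m >= mag)
    return n <= max_events


# Toy catalog; the four early magnitudes are placeholders, not estimates.
toy_catalog = [
    (0.0, 7.5), (0.001, 7.0), (0.1, 7.3), (0.15, 7.2),
    (32.0, 6.0), (84.0, 6.0),
    (192.0, 4.1), (195.0, 4.3), (199.0, 4.2),
]

checks = (early_clustering_ok(toy_catalog),
          recent_rate_ok(toy_catalog),
          late_large_ok(toy_catalog))
print(checks)
```

Applying the same three tests to each synthetic ETAS catalog, and counting the fraction that passes all of them, gives the kind of acceptance fraction reported in Fig. 2.
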

We generated synthetic ETAS catalogs, search- 
ing for a single set of subcritical, direct Omori 
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parameters that matched the three robust obser- 
vational constraints described above. The fractions
of stochastic catalogs that are consistent with
early clustering behavior and with recent seismicity
in the New Madrid region are shown in Fig. 2,
A and B, respectively. These two constraints re- 
duce the possible ETAS phase space to a small 
region (Fig. 2C). Synthetic catalogs produced in 
this region of the ETAS phase space are very 
productive both early and late in the sequence. We 
find that synthetic sequences that are active enough 
to match observed New Madrid-style early clustering
behavior and current seismicity rates contain
many more M ≥ 6 events at intermediate times
than have been observed (table S1). At 95% 
confidence, no set of direct Omori parameters is 
consistent with all three of our constraints: early 
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Fig. 2. Regions of ETAS parameter space consistent with New Madrid 
behavior. The unphysical, supercritical regime (see materials and methods) is 
shown in red. (A) ETAS simulations within the subcritical regime are sampled 
at the black points; colors show a linear interpolation of the fraction of 
synthetic sequences for which the four largest shocks in the first 2 months are 
within 0.7 magnitude units of each other, as was seen in the New Madrid 
sequence. Above the black line (which theoretically is smooth but has small 
irregularities due to sampling error), at least 5% of synthetic sequences are 
consistent with New Madrid clustering behavior; below this line, the early 
behavior is less productive than observations. The red dot shows average 
California parameters (25) for reference. (B) The fraction of synthetic sequences 
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that have a late (200 years post-mainshock) aftershock rate that matches 
current New Madrid seismicity rates. (C) The parameter space consistent with 
both early clustering and current seismicity rates is confined to a small region; 
we sample sequences at the points shown and find that sequences with parameters
in this region typically produce a much higher rate of M ≥ 6 earthquakes
after the first year than that observed. (D) The maximum fraction, over all 
mainshock magnitudes, that is consistent with early clustering, current seismicity
rates, and the rate of M ≥ 6 earthquakes after the first year, linearly
interpolated between sampling points. Although some variation in this plot is 
due to sampling error, all points have been sampled sufficiently to determine 
that the fraction is less than 5%, at 95% confidence (see table S1).
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clustering, current seismicity rates, and the rate of 
M ≥ 6 events after the first year (Fig. 2D). Among
sequences sampled that were consistent with New
Madrid early clustering behavior and current seismicity
rates, the mean number of M ≥ 6 earthquakes
from 1 year to 200 years post-mainshock
was 135. At best, at some points in ETAS phase
space ~1.7% of the sequences are consistent with
our criteria. Results using a stricter criterion that
includes the observation that no M ≥ 6 earthquakes
occurred in the region in the past 100 years
(table S1) show that we can reject the long-lived 
aftershock hypothesis at even higher confidence. 
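
A statement such as "the fraction is less than 5%, at 95% confidence" can be backed by a one-sided binomial bound on the number of simulated sequences that pass all criteria. The source does not say how its confidence statement was computed, so the sketch below substitutes a standard exact (Clopper-Pearson) upper limit, found by bisection. The counts in the example are hypothetical.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def upper_bound(k, n, confidence=0.95):
    """One-sided exact upper limit on the true passing fraction, given that
    k of n simulated sequences passed."""
    if k >= n:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(60):                 # bisection on a CDF that decreases with p
        mid = 0.5 * (lo + hi)
        if binomial_cdf(k, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical counts: 17 of 1000 sequences pass, i.e. an observed 1.7%.
print(upper_bound(17, 1000), upper_bound(0, 59))
```

With zero passing sequences the bound has the closed form 1 − (1 − confidence)^(1/n), so under this method at least 59 samples are needed before an observed fraction of zero supports a limit below 5% at 95% confidence.
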

Based on our statistical analysis, the hypoth- 
esis that current seismicity in the New Madrid 
region is primarily composed of aftershocks from 
the 1811-1812 sequence fails. This is because a 
sequence active enough at late times to produce 
the seismicity rates observed today and active 
enough at early times to produce the short-term 
clustering observed in the first few months would 
be highly likely to produce too many aftershocks 
in the intermediate times. If current seismicity in 
the New Madrid region is not composed pre- 
dominantly of aftershocks, there must be con- 
tinuing strain accrual. This is in agreement with 
recent work finding nonzero strain measurements
in the region that are consistent with ongoing
interseismic slip of about 4 mm/year (21),
in contrast to earlier studies [e.g., (22)]. The spa- 
tial distribution of the stress pattern driven by 


this model would be generally consistent with the 
stress change caused by an earthquake on the 
Reelfoot fault. This could explain how ongoing 
microseismicity is not part of an aftershock se- 
quence but is still consistent with the predicted 
stress change associated with the 1811-1812 se- 
quence (23). If ongoing microseismicity does re- 
sult from ongoing strain accrual, this suggests that 
the region, along with the neighboring Wabash 
Valley where nonzero strain has also been observed 
(24), will continue to be a source of hazard. 
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Evolutionarily Dynamic Alternative 
Splicing of GPR56 Regulates Regional 
Cerebral Cortical Patterning 
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The human neocortex has numerous specialized functional areas whose formation is poorly 
understood. Here, we describe a 15—base pair deletion mutation in a regulatory element of GPR56 
that selectively disrupts human cortex surrounding the Sylvian fissure bilaterally including “Broca’s 
area,” the primary language area, by disrupting regional GPR56 expression and blocking RFX 
transcription factor binding. GPR56 encodes a heterotrimeric guanine nucleotide—binding 
protein (G protein)—coupled receptor required for normal cortical development and is expressed 
in cortical progenitor cells. GPR56 expression levels regulate progenitor proliferation. GPR56 splice 
forms are highly variable between mice and humans, and the regulatory element of gyrencephalic 
mammals directs restricted lateral cortical expression. Our data reveal a mechanism by which 
control of GPR56 expression pattern by multiple alternative promoters can influence stem cell 
proliferation, gyral patterning, and, potentially, neocortex evolution. 


Although most mammals have elaborate
and species-specific patterns of folds
(“gyri”) in the neocortex, the genetic and
evolutionary mechanisms of cortical gyrification
are poorly understood (1–3). Abnormal gyrification,
such as polymicrogyria (too many small
gyri), invariably signals abnormal cortical devel- 


opment, so regional disorders of gyrification are 
of particular interest, because they highlight mech- 
anisms specific to cortical regions. The human 
cortex contains dozens of cortical regions spe- 
cialized for distinct functions—such as language, 
hearing, and sensation—yet it is unsolved how 
these cortical regions form and how human cor- 


tical regions evolved from those of prehuman 
ancestors. 

Examination of >1000 individuals with gyral 
abnormalities identified five individuals from three 
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families (one Turkish and two Irish-American) 
with strikingly restricted polymicrogyria limited 
to the cortex surrounding the Sylvian fissure 
(Fig. 1, A and B; fig. S1; and movies S1 and S2),
which suggests a rare, but genetically distinctive, 
condition. Affected individuals suffered intellec- 
tual and language difficulty, as well as refractory 
seizures (onset 7 months to 10 years), but had no 
motor disability (table S1). Magnetic resonance 
imaging (MRI) and quantitative gyral analysis 
showed abnormal inferior and middle gyri in pre- 
frontal and motor cortex, with mildly affected 
temporal lobes. Broca’s area—the “motor center 
for speech” (4)—in the left hemisphere and the 
corresponding areas of the right hemisphere were 
most severely affected. Affected neocortical sur- 
face showed abnormally numerous, small gyral- 
like folds that fused in coarse, irregular patterns, 
with abnormal and highly irregular white matter 
protrusions, consistent with polymicrogyria (5, 6), 
along with widening of the Sylvian fissure (Fig. 
1A and fig. S1B). 

Genome-wide analysis identified a single linked 
locus on chromosome 16q12.2-21 (Fig. 1C) con- 
taining the GPR56 gene, which, when mutated in 
its coding region, leads to polymicrogyria of the 
entire neocortex, as well as cerebellar and white 
matter abnormalities (7–9). As we found no mutations
in the exons of GPR56, we sequenced
38 conserved non-exonic elements (table S2), in 
one of which we identified a small deletion in all 
five individuals. The mutated element normally 
contains two copies of a 15—base pair (bp) tan- 
dem repeat, but all affected individuals have a 
homozygous deletion of one 15-bp repeat (Fig. 1, 
E and F). The deletion is heterozygous in parents 
of the affected individuals, who manifest no ob- 
vious clinical signs, and is absent from thousands 
of control chromosomes in the Single-Nucleotide 
Polymorphism Database and 1000 Genomes 
database. The two Irish-American families carry 
the mutation on the same chromosomal haplo- 
type, which reflects a common founder. It is note- 
worthy that the Turkish family carries the same 
deletion on a distinct haplotype, which indicates 
that the mutation arose independently (Fig. 1D). 
The element is located <150 bp upstream of 
the transcriptional start site of noncoding exon 1m 
(e1m) of GPR56, which suggests that it may regulate
e1m expression as a cis-regulatory element.
GPR56 has at least 17 alternative transcription
start sites, each beginning from a different noncoding
first exon; all of the start sites are predicted
to drive transcription of mRNAs whose
coding sequence starts from exon 3 (Fig. 2A and
fig. S2A) and all of which encode the same
GPR56 protein (10, 11). The diverse noncoding
first exons have distinct expression profiles, with
e1m being the most robustly transcribed first exon
in fetal human brain but with several other alter- 
native transcripts also expressed in fetal and adult 
brain (Fig. 2A and fig. S2, B to D). 

To confirm that the 15-bp deletion disrupts 
perisylvian GPR56 expression, we generated transgenic
mice with the 23-kb human GPR56 upstream
region driving green fluorescent protein
(GFP) expression. The 23-kb region encompasses
16 of the 17 transcription start sites, including
e1m, and ends before the translation start codon
(Fig. 2A). This construct drives GFP expression
in the entire central nervous system, including
neocortex, and recapitulates the location and relative
amount of expression of endogenous mouse
GPR56 protein (Fig. 2B and fig. S3). In contrast,
the 23-kb construct containing the 15-bp deletion
drives expression in medial, but not lateral, cortex
or lateral ganglionic eminence (Fig. 2B). These
data suggest that the cis-regulatory element upstream
of e1m drives GPR56 expression in the
perisylvian and lateral cortex, whereas disruption
of the element, with consequent impairment of e1m
expression, causes the perisylvian malformation.

To elucidate how the 15-bp deletion in the 
cis-regulatory element disrupts e1m expression,
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taining GPR56. LOD, logarithm of the odds ratio for linkage. (D) The mutation arose inde- 
pendently in the Turkish and the Irish-American families. Haplotype mapping shows that 
pedigree 1 (1-V:1, 1-V:2) and pedigree 2 (2-VI:1, 2-VI:2) are unrelated. Homozygous single- 
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causes perisylvian polymicrogyria. 2-V:2 stands for a heterozygous parent and 2-VI:2 for an 
affected individual from pedigree 2. 
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we performed yeast one-hybrid (Y 1H) screening 
of a mouse forebrain cDNA library with the hu- 
man cis-regulatory element as bait and obtained 
multiple yeast colonies encoding members of 
the regulatory factor X (Rfx) transcription factor 
family (Fig. 2C) (72). RFX1 and RFX3 bind the 
normal element in vitro, with binding decreased 
60 to 70% by the 15-bp deletion (Fig. 2D). Chro- 
matin immunoprecipitation sequencing confirmed 
RFX3 binding to the element (fig. S4) (73). RFX1 
and GPR56 colocalize in germinal zones of fetal 
human brain (Fig. 2E). Dominant-negative RFX 
abrogates normal, but not mutant, elm promoter 
activity on embryonic day 13.5 (E13.5) in mouse 
cortical cultures (Fig. 2F). Furthermore, genetic 
ablation of Rfc4 decreases Gpr56 expression in 
developing mouse brain (/4). REX and GPR56 ex- 
pression patterns are correlated (fig. S5, A and B) 


(15), with RFX3 and RFX7 most prominent in 
human ventrolateral prefrontal cortex, the region 
affected in perisylvian polymicrogyria (Fig. 2G), 
which suggests that multiple RFX proteins regu- 
late the element. 

GPR56 encodes an adhesion heterotrimeric guanine nucleotide-binding
protein (G protein)-coupled receptor that is highly expressed in
cortical progenitors (7, 16) and binds extracellular matrix proteins
(17). Loss of GPR56 disrupts radial glia and causes breaches in the
pial basement membrane, through which some neurons overmigrate
(9, 16). However, even where the pia is intact, we found that
neocortical thickness and organization are irregular, with occasional
thin regions in Gpr56 knockout mice (Fig. 3A). Postmortem analysis of
a human with biallelic GPR56 coding mutations showed a very thin
cortex, which suggested potential roles of GPR56 in neurogenesis as
well (9). GPR56 protein is most highly expressed in progenitors in the
ventricular and subventricular zones during neurogenesis in mice
(16, 18). GPR56 expression in developing human and marmoset neocortex
is highest in the ventricular zone, as well as in the outer
subventricular zone, which is expanded in mammals with larger brains
(2) (Fig. 3B and fig. S5, C and D).

Impairment and overexpression of GPR56 show that its expression
regulates proliferation. Gpr56 knockout mice show fewer phosphohistone
H3 (PH3)-positive mitotic progenitor cells and TBR2-positive
intermediate progenitors than wild-type mice in the neocortex at
E14.5. Conversely, mice carrying a transgene that directs
overexpression of human GPR56 show increased mitotic progenitor cells
and intermediate progenitor cells (Fig. 3, C and D). In utero
electroporation (at E13.5 with analysis 48 hours later at E15.5) of a
plasmid encoding GPR56 (as well as GFP, to mark the cells) caused
cells to persist in proliferative zones compared with cells expressing
GFP alone (Fig. 3E). Changes in the number of intermediate progenitors
in transgenic and knockout mice may be secondary to changes in the
radial progenitor cells that generate them or might reflect a direct
role of GPR56 in intermediate progenitors but are consistent with a
report that loss of TBR2 (EOMES) also causes polymicrogyria in humans
(19).

throughout the transgenic mouse neocortex (E14.5), which mirrors
endogenous GPR56 protein expression. The 15-bp deletion eliminates GFP
expression from lateral cortex but preserves medial cortex expression,
consistent with lesions observed by brain MRI (fig. S1) (n = 4 to 6
embryos with identical patterns per construct). Scale bar, 200 μm.
(C) Y1H screening reveals Rfx transcription factor binding to the
cis-regulatory element. See text for details. (D) The mutation
decreases RFX binding to the cis-regulatory element in vitro. (E) RFX1
and GPR56 are colocalized in a human fetal brain 19 weeks after
conception. Higher magnification of the outer subventricular zone is
shown (right). v, ventricular zone; is, inner subventricular zone; os,
outer subventricular zone; and i, intermediate zone. Scale bars,
100 μm (left) and 10 μm (right). (F) Dominant-negative RFX (white
bars) abrogates normal e1m promoter activity. Black bars, GFP control.
(G) Each RFX gene has distinct expression patterns in the fetal human
brain. Each number denotes the corresponding RFX isoform. RFX3 and
RFX7 are enriched in regions affected by perisylvian polymicrogyria
(green boxes). pfc, prefrontal cortex; opfc, orbital pfc; dlpfc,
dorsolateral pfc; mpfc, medial pfc; vlpfc, ventrolateral pfc; ms,
motor-sensory cortex. *P < 0.001, t test.

The cis-regulatory element upstream of GPR56 e1m is found in genomes
of all placental mammals, but not monotremes, marsupials, or
nonmammals, which suggests that it emerged after placental and
nonplacental mammals diverged 85 to 100 million years ago (fig. S7B).
The cis-regulatory element sequence is only found at the e1m locus in
GPR56 but not elsewhere in the genome. E1m itself shows homology at
its 3’ end to a long interspersed nuclear element (LINE)-4/RTE, a
family of retrotransposons active in early mammals after divergence
from marsupials (20). Another noncoding GPR56 exon (exon 2), present
only in primates, derives from a primate-specific Alu insertion
(fig. S7B). In contrast to the >17 alternative first exons in humans,
we found only five noncoding first exons in the mouse Gpr56 gene
(Fig. 4A and fig. S7A) (10, 11). Thus, GPR56 acquired many noncoding
upstream exons and generated alternative splice forms with distinct
expression patterns (fig. S2, B and D), in the lineage leading to
humans. Transposable element insertion played a role in generating
this diversity.

To test directly whether evolutionary changes in GPR56’s alternative
splice forms have functional effects, we generated transgenic mice in
which the β-galactosidase (β-gal) gene is driven by a minimal 300-bp
e1m promoter from human, mouse, marmoset, dolphin, and cat (Fig. 2A
and fig. S6A). The mouse e1m promoter drives β-gal expression broadly
in the nervous system in diverse cell types, which suggests that this
simple 300-bp e1m promoter is sufficient to recapitulate major
features of the endogenous mouse GPR56 expression (16-18) (Fig. 4B and
fig. S6B). In contrast, the corresponding human e1m promoter has a
variety of deletions and single-nucleotide variants, relative to the
mouse sequence (fig. S6A), and drives much more limited expression in
rostral-lateral cortex (Fig. 4B and fig. S6B). Weak lateral cortical
expression is visible in embryos carrying the mouse e1m
promoter:β-gal transgene, which suggests that, in humans, additional
elements besides the 300-bp e1m promoter region are required to drive
the full complement of GPR56 expression. E1m promoters from marmoset,
dolphin, and cat drive expression patterns generally similar to human.
The shared expression patterns in the four mammals that have a Sylvian
fissure (Fig. 4C) suggest that elaboration of GPR56 noncoding
regulation is consistent with the larger number of noncoding first
exons in gyrencephalic mammals and humans. Elaboration of additional
alternative splice forms provides a mechanism for potentially
independent evolution of these multiple forms.

Fig. 3. GPR56 regulates neuroprogenitor proliferation. (A) In Gpr56
knockout mice, neurons overmigrate through breached pial basement
membrane (arrowheads) or undermigrate (arrows) forming irregular
cortical layers, as shown by immunostaining of Cux1, an upper layer
(II to IV) marker (p9). Thin cortex is occasionally observed
(asterisks). (B) GPR56 is highly expressed in human ventricular zone
and outer subventricular zone at 12 to 17 weeks of gestation (GW),
which suggests roles in neuroprogenitors. v, ventricular zone; s,
subventricular zone; is, inner subventricular zone; os, outer
subventricular zone; i, intermediate zone; c, cortical plate; and m,
marginal zone. (C and D) Human GPR56 transgenic (Tg) mice have
significantly more mitotic (PH3+) neuroprogenitor cells and
intermediate progenitor (TBR2+) cells than wild-type (WT). In
contrast, Gpr56 knockout (KO) mice have significantly fewer mitotic
cells and intermediate progenitors than WT (E13.5 to E14.5). (n = 7
mice per genotype; *P < 0.005; **P < 0.001; paired t test). (E) The
cells that are in utero electroporated (from E13.5 to E15.5) with
human GPR56-IRES-GFP [either side of the internal ribosome entry site
(IRES), GFP expressing] persist in the germinal zones longer than the
GFP control cells. Red, TBR2; blue, Hoechst. (n = 11 mouse embryos per
construct; *P < 0.0001; chi-squared test). Scale bars, 500 μm (A) and
100 μm (B) to (E).

Fig. 4. GPR56 gene evolution. (A) The mouse Gpr56 locus has only 5
transcription start sites compared with 17 in human. (B) A 300-bp
human GPR56 e1m promoter sequence containing the cis-regulatory
element (Fig. 2A) directs β-gal expression to lateral cortex in mice
(arrow), whereas the orthologous mouse e1c(m) promoter directs more
widespread expression. Orthologous promoters from other mammals with
larger cortical sizes, abundant gyri, or both (marmoset, dolphin, and
cat) drive relatively limited expression patterns generally similar to
human (n = 3 to 10 embryos with identical patterns per promoter).
Scale bar, 2 mm. (C) Human, marmoset, dolphin, and cat brains are
gyrencephalic or near-gyrencephalic with a Sylvian fissure
(arrowhead). Mouse brain lacks both gyri and Sylvian fissure. Scale
bar, 1 cm. Images from the University of Wisconsin and Michigan State
Comparative Mammalian Brain Collections and/or the National Museum of
Health and Medicine are reproduced with permission from
Brainmuseum.org.

Our studies show that levels of GPR56 control proliferation of
progenitors in the neocortex. Loss of GPR56 expression impairs
neurogenesis, and overexpression enhances proliferation and progenitor
number. Selective GPR56 loss causes strikingly regional defects of
cortical development. GPR56 likely influences progenitor proliferation
by stabilizing the basal process of radial neuroepithelial
progenitors, because (i) GPR56 protein localizes to the basal
processes of radial neuroepithelial cells (16); (ii) GPR56 binds
extracellular matrix proteins in the pial basement membrane, such as
collagen type III (17) and tetraspanins, which bind integrins as well
(21); (iii) GPR56 is required for normal attachment of basal processes
to the pial basement membrane in mice (16); and (iv) basal processes
regulate progenitor proliferation via integrin signaling (22, 23), and
GPR56 interacts genetically with α3β1 integrin (24). The elaboration
of the GPR56 locus in gyrencephalic mammals, and especially humans, to
produce many alternative splice forms with diverse expression patterns
presents GPR56 as a key target that could influence the dramatic
changes in shape and folding that characterize the forebrain of many
mammals. Elaboration and specialization of alternative transcripts
with distinct transcription start sites is an evolutionary mechanism
that has been difficult to study because of the lack of comprehensive
catalogs of RNA splice forms, but continued RNA sequencing studies may
soon provide the opportunity to assess its importance systematically.


References and Notes 

1. P. Rakic, Nat. Rev. Neurosci. 10, 724-735 (2009).
2. J. H. Lui, D. V. Hansen, A. R. Kriegstein, Cell 146, 18-36 (2011).
3. K. Zilles, N. Palomero-Gallagher, K. Amunts, Trends Neurosci. 36, 275-284 (2013).
4. K. Amunts et al., PLOS Biol. 8, e1000489 (2010).
5. J. A. Golden, B. N. Harding, Nat. Rev. Neurol. 6, 471-472 (2010).
6. A. J. Barkovich, Neuroradiology 52, 479-487 (2010).
7. X. Piao et al., Science 303, 2033-2036 (2004).
8. X. Piao et al., Ann. Neurol. 58, 680-687 (2005).
9. N. Bahi-Buisson et al., Brain 133, 3194-3209 (2010).
10. D. Thierry-Mieg, J. Thierry-Mieg, Genome Biol. 7 (suppl. 1), S12-S14 (2006).
11. Y. Suzuki, R. Yamashita, K. Nakai, S. Sugano, Nucleic Acids Res. 30, 328-331 (2002).
12. S. J. Ansley et al., Nature 425, 628-633 (2003).
13. A. Jolma et al., Genome Res. 20, 861-873 (2010).
14. D. Zhang et al., J. Neurochem. 98, 860-875 (2006).
15. H. J. Kang et al., Nature 478, 483-489 (2011).
16. S. Li et al., J. Neurosci. 28, 5817-5826 (2008).
17. R. Luo et al., Proc. Natl. Acad. Sci. U.S.A. 108, 12925-12930 (2011).
18. S. J. Jeong, R. Luo, S. Li, N. Strokes, X. Piao, J. Comp. Neurol. 520, 2930-2940 (2012).
19. L. Baala et al., Nat. Genet. 39, 454-456 (2007).
20. T. S. Mikkelsen et al., Nature 447, 167-177 (2007).
21. L. Xu, R. O. Hynes, Cell Cycle 6, 160-165 (2007).
22. R. Radakovits, C. S. Barros, R. Belvindrah, B. Patton, U. Müller, J. Neurosci. 29, 7694-7705 (2009).
23. S. A. Fietz et al., Nat. Neurosci. 13, 690-699 (2010).
24. S. J. Jeong et al., PLOS ONE 8, e68781 (2013).


Acknowledgments: Research performed on samples of human origin was
conducted according to protocols approved by participating
institutions, including Boston Children’s Hospital and Beth Israel
Deaconess Medical Center. The human embryonic and fetal material was
provided by the Joint Medical Research Council (grant no.
G0700089)-Wellcome Trust (grant no. GR082557) Human Developmental
Biology Resource (www.hdbr.org) and the National Institute of Child
Health and Human Development, NIH, Brain and Tissue Bank at the
University of Maryland (contract no. HHSN275200900011C, reference no.
N01-HD-9-0011). This work was supported by the Strategic Research
Program for Brain Sciences from the Ministry of Education, Culture,
Sports, Science and Technology (MEXT), Japan (H.O.); Funding Program
for World-Leading Innovative R&D on Science and Technology (FIRST
Program) (H.O.); U01MH081896 from National Institute of Mental Health,
NIH (N.S.); 2R01NS035129 from National Institute of Neurological
Disorders and Stroke, NIH (C.A.W.); and The Paul G. Allen Family
Foundation (C.A.W.). Additional funding support listed in
supplementary materials. C.A.W. is an investigator of the Howard
Hughes Medical Institute. Gpr56 knockout mice are available from
Genentech subject to a Material Transfer Agreement.


Supplementary Materials
www.sciencemag.org/content/343/6172/764/suppl/DC1
Materials and Methods
Supplementary Text
Figs. S1 to S7
Tables S1 and S2
Movies S1 and S2
References (25-40)


7 August 2013; accepted 17 December 2013 
10.1126/science.1244392 


14 FEBRUARY 2014 VOL 343 SCIENCE www.sciencemag.org 


Origin and Spread of de Novo Genes in 
Drosophila melanogaster Populations 


Li Zhao,* Perot Saelao, Corbin D. Jones, David J. Begun*


Comparative genomic analyses have revealed that genes may arise from ancestrally nongenic 
sequence. However, the origin and spread of these de novo genes within populations remain 
obscure. We identified 142 segregating and 106 fixed testis-expressed de novo genes in a 
population sample of Drosophila melanogaster. These genes appear to derive primarily from 
ancestral intergenic, unexpressed open reading frames, with natural selection playing a significant 
role in their spread. These results reveal a heretofore unappreciated dynamism of gene content. 


Although the vast majority of genes present in any species descend
from a gene present in an ancestor, recent analyses suggest that some
genes originate from ancestrally nongenic sequences (1-3). Evidence
for these “de novo” genes has generally derived from a combination of
phylogenetic and genomic/transcriptomic analyses that reveal evidence
of lineage- or species-specific transcripts associated with nongenic
orthologous sequences in sister species. De novo genes, which were
first identified in Drosophila (1-3), have also been identified in
humans, rodents, rice, and yeast (4-9). In Drosophila, de novo genes
tend to be specifically expressed in tissues associated with male
reproduction (2, 10), which suggests that sexual or gametic selection
may be important (1-3, 9), although other functional roles may evolve
(10, 11). Because previous studies of de novo gene evolution used
comparative rather than population genetic approaches, the earliest
steps in de novo gene origination remain mysterious. Here, we used
population genomic and transcriptomic data from Drosophila
melanogaster and its close relatives to investigate the origin and
spread of de novo genes within populations.
Illumina paired-end RNA sequencing (RNA-seq) and de novo and
reference-guided assembly and alignment were used to characterize the
testis transcriptome of six previously sequenced inbred Raleigh (RAL)
D. melanogaster strains (12); an average of 65 million paired-end
reads were produced for each strain (table S1). We inferred (13) the
presence of 142 polymorphic de novo candidate genes that are expressed
in at least one RAL strain but are not known on the basis of publicly
available data from D. melanogaster. The median number of segregating
de novo genes carried per strain was 49. Reverse transcription
polymerase chain reaction (RT-PCR) and 5’ and 3’ rapid amplification
of cDNA ends (RACE) in a subset of genes supported inferences from
RNA-seq analysis (table S2). These candidate polymorphic genes
correspond to unique, intergenic sequence in the D. melanogaster
reference sequence (table S3), are alignable to unique orthologous
regions in the D. simulans and D. yakuba reference sequences, and show
no significant BLASTP hits to the NCBI nr (nonredundant) protein
database. The candidate genes exhibited expression neither in testis
RNA-seq data from three D. simulans and two D. yakuba strains (table
S1 and fig. S1) nor in whole male and female RNA-seq data from 59
D. simulans strains (13). None of the candidates showed significant
expression in whole females from the same D. melanogaster strains used
for testis RNA-seq (table S4). These data support the hypothesis that
the 142 candidates are new, male-specific, de novo genes still
segregating in D. melanogaster. Expression levels of the candidate
genes greatly exceed levels of background transcription in intergenic
sequence (fig. S2) (13); several additional attributes of these genes,
as described below, support the hypothesis that the observed
transcripts are biologically meaningful.

Segregating de novo genes were moderately expressed (Fig. 1A and
Table 1), but their expression was significantly lower than that of
annotated male-biased genes (13) (Table 1) or annotated genes (table
S6). We observed no enrichment of polymorphic de novo genes near
annotated male-biased genes and no significant correlation between the
strand (+/-) of polymorphic de novo genes and that of their immediate
annotated neighbors [χ² test, P > 0.1 (table S5 and fig. S3),
supported by simulations (13)]. There was a marginally significant
underrepresentation of X chromosome segregating de novo genes relative
to annotated male-biased genes (10 genes are X-linked; ¢ exact test,
P = 0.01; Fig. 1B). This result stands in contrast to speculation
based on a small sample of older, fixed de novo genes (2, 3) that de
novo male-biased genes are overrepresented on the X chromosome.

As expected, de novo genes were significantly shorter and simpler than
annotated genes and annotated male-biased genes (Table 1 and table
S6). This pattern is likely due mostly to the larger proportion of
polymorphic de novo genes that are single-exon (57.0%) compared to the
proportion of annotated single-exon (table S6) or single-exon
male-biased genes (Table 1) (13). Among the 61 multi-exon de novo
genes, the majority of splice events (98%) were associated with
canonical sites; rare noncanonical splice sites were found in four
genes as minor isoform splice events, which were similar to those
previously observed in D. melanogaster (14). Alternative splicing was
observed in 20 of the 61 multi-exon segregating de novo genes (table
S7), with conserved reading frames across alternative isoforms. Genes
associated with alternative splicing generally exhibited multiple
isoforms across strains that expressed the corresponding gene, with no
evidence of genetic variation for alternative splice use.

Fig. 1. Basic properties of segregating de novo genes. (A) Expression
estimates of segregating de novo genes, fixed de novo genes, all
annotated genes, and annotated male-biased genes in D. melanogaster.
(B) Simulation of de novo gene locations. The boxplot for each
chromosome is the simulated number of genes from intergenic regions.
The black dot is the observed number. The X chromosome is the only
chromosome arm that deviates from the expected number of genes
(¢ test, P = 0.01). (C) Pie chart of segregating de novo gene
frequency.

Of 142 polymorphic genes, 134 (94%) had a minimum open reading frame
(ORF) of at least 150 base pairs (bp) and were classified as
potentially coding. To determine the likelihood that the high
proportion of genes harboring long ORFs occurred by chance, we
investigated the coding potential of intergenic regions in the
reference sequence, focusing on single-exon ORFs. We observed that
59.9% of random 800-bp intergenic sequences were associated with a
>150-bp single-exon ORF, whereas 97.5% of the observed single-exon de
novo genes were associated with such an ORF (P < 0.01). Moreover, the
mean length of single-exon de novo gene ORFs was substantially greater
than that expected in random intergenic sequence (P < 0.05). These
observations further support the idea that the observed transcripts
are unlikely to be explained simply as random noise. The eight
polymorphic de novo genes that did not satisfy our arbitrary minimum
ORF criterion were autosomal and slightly smaller (mean transcript
length = 743 bp) than ORF-containing polymorphic genes. Because
orthologous sequences from expressing and non-expressing
D. melanogaster lines have similar coding potential, most segregating
de novo genes are likely to have resulted from the recruitment of
small, preexisting, unexpressed ORFs (1). For D. simulans and
D. yakuba orthologous sequences, 70% and 45%, respectively, contained
ORFs similar to those observed for segregating genes in
D. melanogaster. Of the 134 predicted de novo proteins, 41.8% may be
intrinsically unfolded (fig. S4, A to D) and 50% of these have
predicted binding regions (fig. S4E); both observations are consistent
with potential biological function (15). For putative protein-coding
genes, the average 5’ and 3’ untranslated region (5’UTR and 3’UTR)
lengths (248 bp and 364 bp, respectively) were slightly shorter than
the average lengths for annotated D. melanogaster genes but were
slightly longer than the averages for annotated male-biased genes
(Table 1). The incidence of the two major polyadenylation signals
(AAUAAA and AUUAAA) in or near the putative 3’UTRs of segregating de
novo genes was similar to, but slightly lower than, the incidence in
the whole genome (table S8). Overall, polymorphic de novo genes have
structural organization consistent with small protein-coding genes in
the species.
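
The ORF screen described in this paragraph can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`longest_orf`, `fraction_with_orf`) are hypothetical, an ORF is assumed here to run from an ATG to the first in-frame stop codon on either strand, and only the 150-bp threshold comes from the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Length in bp of the longest ATG-to-stop ORF on either strand.

    The stop codon is counted in the length (an assumption of this sketch).
    ORFs that lack an in-frame stop codon are ignored.
    """
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None  # look for the next ORF in this frame
    return best


def fraction_with_orf(seqs, min_len=150):
    """Fraction of sequences whose longest ORF is at least min_len bp."""
    hits = sum(1 for s in seqs if longest_orf(s) >= min_len)
    return hits / len(seqs)
```

Applied to a collection of random 800-bp intergenic windows, `fraction_with_orf` would give the kind of background proportion that the text compares with the proportion seen in single-exon de novo genes.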

Segregating de novo genes either were ex- 
pressed at a relatively high level in expressing 
strains or showed almost no evidence of expres- 
sion in other strains. Hartigan’s dip test on tran- 
script abundance estimates rejected unimodality 
for 134 of 142 genes and was consistent with bi- 
modal expression across lines for most genes. We 
used a cutoff of two fragments per kilobase of 
exon per million fragments mapped (FPKM > 2) 
for inferring expression of a transcript in a line 
(16) to determine the proportion of strains, from 
0.17 (1/6) to 1.0 (6/6) expressing each transcript. 
Because no candidates show expression in the 
reference sequence strain, the genes expressed in 
all six RAL strains are considered to be poly- 
morphic in the species. More than half the genes 
(55%) were not rare in the Raleigh sample, as they 
were expressed in at least two of the six RAL 
strains (Fig. 1C); 29.5% were definitely common, 
being expressed in three or more strains, which is 
inconsistent with mutation-selection balance. We 
observed 106 unannotated male-specific transcripts 
expressed in all six strains and in the reference 
strain (table S9) but not in the outgroup strains. 
The corresponding “fixed” de novo genes were 
not included in downstream analyses relating to 
segregating genes. 
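
As a rough illustration of the frequency calculation in this paragraph, the sketch below applies the FPKM > 2 cutoff to per-strain estimates for one transcript. The strain labels, the function names, and the class labels are hypothetical; only the cutoff and the two-strain and three-strain boundaries come from the text.

```python
FPKM_CUTOFF = 2.0  # expression threshold stated in the text (FPKM > 2)


def expression_frequency(fpkm_by_strain, cutoff=FPKM_CUTOFF):
    """Return (number of expressing strains, proportion of strains expressing)."""
    if not fpkm_by_strain:
        raise ValueError("need FPKM values for at least one strain")
    n_expressing = sum(1 for value in fpkm_by_strain.values() if value > cutoff)
    return n_expressing, n_expressing / len(fpkm_by_strain)


def frequency_class(n_expressing):
    """Hypothetical labels for the frequency classes discussed in the text."""
    if n_expressing == 0:
        return "not expressed"
    if n_expressing == 1:
        return "singleton"
    if n_expressing == 2:
        return "intermediate"  # not rare, but below the three-strain boundary
    return "common"  # expressed in three or more strains


# Made-up FPKM values for one transcript in six strains.
example = {"RAL-A": 12.4, "RAL-B": 0.3, "RAL-C": 8.1,
           "RAL-D": 0.0, "RAL-E": 2.0, "RAL-F": 15.9}
n, freq = expression_frequency(example)
print(n, freq, frequency_class(n))
```

The comparison is strict, so a transcript at exactly FPKM = 2 is treated as not expressed. Transcripts that are also expressed in the reference strain are the "fixed" class in the text and are not modeled here.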

We extracted the 100 bp upstream and 50 bp 
downstream of the inferred transcription start site 
(TSS) from the genome sequences of the express- 
ing strains for each of the 61 multi-exon genes. 
MDscan identified and clustered motifs in these 
flanking sequences; sequence logos were then 
generated. We observed four common consensus 
sequence motifs (8 or 10 bp; Fig. 2A), each of 
which was found associated with roughly half the 
segregating de novo genes (13) (table S10). In 
total, 371 annotated male-biased genes (23.3%) 
were also associated with at least one of these 
motifs, which suggests that the de novo genes share 
regulatory features with known male-biased genes. 
We identified 67 annotated male-biased genes 
(table S11) that have two or more motifs in the 5’ 
regions. However, GO (Gene Ontology) enrich- 
ment analysis (fig. S5) provided no insight into 
the possible functions of de novo genes. 

These data support the hypothesis that de novo 
gene expression is influenced by cis-acting var- 
iants in the regions corresponding to the 5’ flank- 
ing regions of expressing chromosomes. In the 
simplest case that de novo gene expression is 
due to a single noncoding nucleotide change, one 
would predict an excess of fixed differences be- 
tween expressing and non-expressing chromo- 
somes in flanking regions compared to random 
samples of intergenic sequences. We focused on the 
32 genes expressed in more than two strains and 
for which our genetic analysis (13) supported 
cis-acting variation driving de novo gene expression. 
Of these genes, 31.2% exhibited a fixed, derived 
single-nucleotide polymorphism (SNP) within 
500 bp upstream of the TSS, whereas only 8.43% 
of simulated “genes” (intergenic regions defined by 
harboring derived SNPs with the same frequency 
distribution as the 32 observed genes) exhibited a 
fixed SNP in the comparable 5’ region (P < 0.01). 
More generally, divergence between expressing and 
non-expressing chromosomes for these 500-bp re- 
gions was significantly greater than divergence in 
simulated data (P = 0.048); this finding supports 
the hypothesis that cis-regulatory changes play a 
role in de novo gene origination. 
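
One plausible reading of a "fixed, derived" difference between expressing and non-expressing chromosomes is sketched below. This is an assumption made for illustration, not the authors' stated procedure: the function name is hypothetical, the sequences are assumed to be pre-aligned and of equal length, and the ancestral state is assumed to come from an outgroup sequence.

```python
def fixed_derived_sites(expressing, non_expressing, ancestral):
    """Positions where every expressing sequence carries the same derived
    allele and that allele is absent from all non-expressing sequences.

    All inputs are aligned DNA strings of equal length; `ancestral` is the
    inferred ancestral sequence (for example, from an outgroup species).
    """
    lengths = {len(s) for s in expressing + non_expressing + [ancestral]}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")

    sites = []
    for pos, anc in enumerate(ancestral):
        expr_alleles = {seq[pos] for seq in expressing}
        non_alleles = {seq[pos] for seq in non_expressing}
        if len(expr_alleles) != 1:
            continue  # not fixed among expressing chromosomes
        allele = next(iter(expr_alleles))
        is_base = allele in "ACGT"      # skip gaps and ambiguity codes
        is_derived = allele != anc      # differs from the ancestral state
        if is_base and is_derived and allele not in non_alleles:
            sites.append(pos)
    return sites
```

Counting the genes for which this function returns at least one site in the 500 bp upstream of the TSS, and doing the same for frequency-matched intergenic regions, would give the two proportions that the text compares.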

Under this hypothesis, segregating genes should be associated with
allele-specific expression. We thus measured allelic imbalance
(17, 18) in the testis in a set of three unique F1 genotypes created
by crossing the six RAL strains (table S1) (13). For the 59 autosomal
genes for which one parent expressed the gene and the other did not,
expression patterns in the heterozygote for 51 genes were explained
completely by cis-acting variation (i.e., allelic imbalance was
complete); 7 genes showed evidence of regulation by both cis-acting
and trans-acting factors. Only 1 of the 59 genes showed no evidence of
allelic imbalance, consistent with expression driven solely by
trans-acting variation (table S12). More generally, for genes
expressed in both parents, the expression of alleles in F1 was
consistent with expression levels in each parental line (table S13),
further supporting the importance of cis-acting expression variants.
The roughly bimodal expression patterns and the dominant role of cis
effects support the idea that the proportion of lines expressing a
gene provides an estimate of its population frequency.
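
The logic of the allelic imbalance test can be sketched as follows for an F1 heterozygote in which only one parent expressed the gene. The function name and both thresholds (0.95 and 0.60) are hypothetical choices for illustration; the text does not give the criteria actually used, which follow the methods cited in (17, 18).

```python
def classify_regulation(reads_expressing_allele, reads_silent_allele,
                        complete=0.95, balanced=0.60):
    """Classify the basis of expression variation from allele-specific reads.

    reads_expressing_allele: F1 reads carrying the allele inherited from the
        parent that expressed the gene.
    reads_silent_allele: F1 reads carrying the allele inherited from the
        parent that did not express the gene.
    Thresholds are illustrative assumptions, not values from the source.
    """
    total = reads_expressing_allele + reads_silent_allele
    if total == 0:
        raise ValueError("no allele-specific reads for this gene")
    fraction = reads_expressing_allele / total
    if fraction >= complete:
        return "cis"        # essentially all reads come from one allele
    if fraction <= balanced:
        return "trans"      # both alleles are expressed about equally
    return "cis+trans"      # partial imbalance


for counts in [(100, 0), (80, 20), (50, 50)]:
    print(counts, classify_regulation(*counts))
```

Under complete cis control the silent parent's allele stays silent in the shared F1 cellular environment, which is why near-total imbalance is read as a cis effect, whereas balanced expression implies that a diffusible trans-acting factor activates both alleles.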

One population genetic explanation for polymorphic de novo genes is
that singleton genes (45% of genes) are primarily deleterious and that
higher-frequency genes are primarily neutral. If the deleterious
nature of de novo genes were due to the cost of transcription or
translation, or from toxic interactions of the resulting RNAs or
proteins with other molecules, then lower-frequency genes should be
more abundantly expressed and longer than higher-frequency genes.
However, contrary to this expectation, lower-frequency genes were
expressed at a lower level, were shorter, and were less complex than
higher-frequency genes (table S6) (13). The different properties of
rare versus common de novo genes (Table 2) (13) support the idea that
de novo genes having certain properties (e.g., greater expression,
longer transcripts, more exons) are more likely to spread under
selection.

Table 1. Properties of segregating and fixed de novo genes and
comparison with annotated male-biased genes in D. melanogaster.
Wilcoxon test, ***P < 0.001, **P < 0.01, *P < 0.05; ns, not
significant. For segregating de novo genes, P values are comparisons
of segregating versus fixed genes and segregating versus male-biased
genes. For fixed de novo genes, P values are comparisons of fixed de
novo genes versus male-biased genes. Male-biased genes are as defined
in (13). All estimates are medians, except for exon number (mean).

                         Segregating      Fixed            Male-biased
                         de novo genes    de novo genes    genes
Number                   142              106              1595
Transcript length (bp)   801 ns/***       1013**           1184
Exon length (bp)         518*/***         512***           355
Exon number              1.47*/***        1.79 [marker illegible]   2.37
Intron length (bp)       [illegible]      70.5***          77
5’UTR length (bp)        248*/***         267.5***         170
3’UTR length (bp)        364 ns/***       337***           267
Single-exon gene (%)     57*/**           48.1***          35.8
Expression (FPKM)        [illegible]      19.96***         66.54

We investigated the role of directional selec- 
tion on polymorphic de novo genes by determin- 
ing whether they are associated with reduced 


nucleotide diversity (J9, 20). For each de novo 
gene expressed in at least two strains, we com- 
pared the nucleotide diversity (x) for expressed 
sequence (strains) versus non-expressed orthologous 
sequence (non-expressing strains) and compared 
the observed differences to a frequency-corrected 
expected value from resampling of intergenic se- 
quence from the six RAL strains (/3). For 46 of 
65 genes, = was lower in the expressed lines (mean = 
0.0060) than in the non-expressed lines (mean = 
0.0092) and exhibited a roughly 38% reduction 
relative to non-expressed orthologous sequence 
Fig. 2. Regulation and population genetics of segregating de novo genes. (A) Potential cis-
regulatory elements. The most common shared 8- and 10-bp consensus motifs in 5’ flanking regions are
listed. From top to bottom, 34, 29, 25, and 30 multiple-exon genes show these motifs. (B) Nucleotide
diversity (π) for de novo genes and flanking regions. The red line is the ratio of π in expressing lines
to π in non-expressing lines; the green line shows expected values from resampling of intergenic DNA
conditional on the same derived allele frequency distribution as the observed de novo genes. π estimates
for 5’ and 3’ flanking regions of genes were incremented in 5-kb windows. (C) A gene (Gene_X_141) that
may have experienced a hard selective sweep. Gray box denotes expressing lines. The TSS region contains
a derived allele fixed in expressing strains and absent in non-expressing strains; flanking regions are
homozygous in expressing strains. (D) A gene (Gene_3L_079) showing no evidence of hard sweep. Gray box
denotes expressing lines. The TSS region includes a derived allele fixed in expressing lines, but the
flanking regions of expressing chromosomes retain nucleotide variation.


Table 2. Properties of segregating genes differ across frequency classes. Wilcoxon test, ***P <
0.001, **P < 0.01, *P < 0.05; P values are comparisons of singleton versus nonsingleton genes and
singleton versus high-frequency (≥ 3/6) genes. FPKM and transcript length estimates are medians; exon
numbers are means.

                          Singleton       Nonsingleton    High-frequency
FPKM                      [illegible]     9.91            12.31
Transcript length (bp)    [illegible]     869             1312
Exon number               1.38*/**        1.53            1.81
over the 65 genes (Wilcoxon test, P = 0.003). For 30 genes, π was significantly
lower in the expressed lines (Wilcoxon test, P < 0.05). The region of reduced
heterozygosity near expressed sequences is on the scale of 5 kb or less (Fig. 2B
and fig. S6), which is contrary to the expectation of strong selection on new
mutations (19) but consistent with weaker selection (20) or soft selective
sweeps (21) (Fig. 2, C and D). Polymorphic de novo genes were significantly
(Wilcoxon test, P < 0.001) (13) more likely to be differentially expressed
between populations (29 of 142, or 17%) relative to annotated genes (4.5%) and
male-biased genes (6.3%), which also supports the idea that selection
may play a role in their spread.
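
The comparison of π between expressing and non-expressing lines can be illustrated with a minimal sketch. It assumes that π is computed as the average number of pairwise differences per site among aligned sequences; the toy sequences and the function name `nucleotide_diversity` are hypothetical, and the frequency-corrected resampling of intergenic sequence described above is not reproduced here.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignments (not data from the study).
expressing = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
non_expressing = ["ACGTACGTAC", "ACGAACGTAT", "ATGTACGTAC"]

pi_expr = nucleotide_diversity(expressing)
pi_non = nucleotide_diversity(non_expressing)
ratio = pi_expr / pi_non  # a ratio below 1 means lower diversity in expressing lines
print(f"pi (expressing) = {pi_expr:.4f}, pi (non-expressing) = {pi_non:.4f}")
```

A ratio below 1 corresponds to the pattern reported for 46 of the 65 genes; whether such a reduction exceeds the resampled expectation is a separate question that this sketch does not address.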

We used the Hudson-Kreitman-Aguadé-like (HKAl) test statistic (22, 23) to
compare the heterozygosity/divergence ratio for genomic regions associated with
fixed de novo genes to that observed for appropriately sampled intergenic
regions (13, 20). The HKAl for fixed regions (mean = −0.48) was significantly
smaller than that expected for comparable random intergenic regions
(mean = 0.12; Wilcoxon test, P < 0.001). Moreover, regions corresponding to
fixed genes associated with higher expression (FPKM > 10) exhibited a smaller
HKAl statistic relative to regions associated with fixed genes having lower
(FPKM < 10) expression (HKAl = −0.33 versus −0.86; Wilcoxon test, P < 0.001).
These observations also support the hypothesis that de novo genes
have been influenced by directional selection.

Our analyses suggest that there are many 
polymorphic de novo male-specific genes in 
D. melanogaster populations, likely recruited by 
selection primarily from ancestral, unexpressed 
ORFs (fig. S7). Given the small number of geno- 
types investigated for a single tissue and our 
strict filtering criteria, we have likely substantial- 
ly underestimated the number of polymorphic 
de novo genes. Our results also suggest the exis- 
tence of many more fixed de novo D. melanogaster 
genes than previously inferred (2, 4, 10), which
supports the idea that a substantial genetic com- 
ponent of male reproductive biology in this spe- 
cies remains completely unexplored. 

More generally, our results suggest that im- 
portant attributes of an organism’s biology cannot 
be accurately represented or investigated without 
knowledge of de novo gene variation within spe- 
cies. In the absence of gene loss, de novo gene 
gain would lead to a long-term increase in gene 
number. Although our analyses are consistent with 
substantial numbers of polymorphic gene losses, 
we observed no population genetic evidence that 
losses result from directional selection (13). Thus,
de novo genes may often spread under selection, 
while gene loss may occur primarily as a result of 
drift associated with loss of ancestral gene func- 
tion. However, important details of such processes 
remain obscure, and much additional work is re- 
quired to clarify the dynamics, biochemical and 
genetic properties, and phenotypic effects of young 
de novo genes and the processes underlying gene 
loss in natural populations. 
Crude Oil Impairs Cardiac
Excitation-Contraction Coupling in Fish

Fabien Brette, Ben Machado, Caroline Cros, John P. Incardona,
Nathaniel L. Scholz, Barbara A. Block*


Crude oil is known to disrupt cardiac function in fish embryos. Large oil spills, such as the 
Deepwater Horizon (DWH) disaster that occurred in 2010 in the Gulf of Mexico, could severely 
affect fish at impacted spawning sites. The physiological mechanisms underlying such potential 
cardiotoxic effects remain unclear. Here, we show that crude oil samples collected from the DWH 
spill prolonged the action potential of isolated cardiomyocytes from juvenile bluefin and yellowfin 
tunas, through the blocking of the delayed rectifier potassium current (IKr). Crude oil exposure also
decreased calcium current (ICa) and calcium cycling, which disrupted excitation-contraction
coupling in cardiomyocytes. Our findings demonstrate a cardiotoxic mechanism by which crude oil
affects the regulation of cellular excitability, with implications for life-threatening arrhythmias
in vertebrates.


Crude oil is a complex chemical mixture containing hydrocarbons (aliphatic and
aromatic) and other dissolved-phase organic compounds. Toxicity research on
crude oil constituents has focused mainly on polycyclic aromatic hydrocarbons
(PAHs) (1, 2), pervasive environmental contaminants that are also found
in coal tar, creosote, air pollution, and land-based runoff. In the aftermath of
oil spills, PAHs can persist for many years in marine habitats and
thereby create pathways for lingering biological
exposure and associated adverse effects.

PAH toxicity is structure-dependent, and the carcinogenic, mutagenic, and
teratogenic properties of many individual PAHs are known (3, 4).
Developing fish are particularly vulnerable to dissolved PAHs in the range of
~100 parts per billion (ppb or µg/liter) down to <10 µg/liter.
Consequently, PAH toxicity to fish early life stages is an important
contributor to both acute and long-term impacts of environmental disasters (2, 5).
Numerous studies on crude oils and PAHs, particularly in the aftermath of the
Exxon Valdez spill, have described embryonic heart failure, bradycardia,
arrhythmias, reduction of contractility, and a syndrome of cardiogenic fluid
accumulation (edema) in exposed fish embryos (6, 7).
These severe effects are lethal to embryos and larval fishes (8–10) and could
be due to atrioventricular conduction block (11).

Despite recent progress using zebrafish and other experimental models to study
PAH cardiotoxicity (12), the mechanisms that underpin the physiological effects
on cardiac function and changes in cardiac morphology during development are
not known. The Deepwater Horizon (DWH) oil spill released >4 million barrels of
crude oil during the peak spawning months for
Atlantic bluefin tuna (Thunnus thynnus) in 2010.
This large and long-lived species reaches a mass
of 650 kg over a life span of 35 years or more
(13), and the Gulf of Mexico population of bluefin tuna is severely depleted
(14). Electronic-tagging data confirm that bluefin tuna spawn
in the vicinity of the DWH spill, which indicates
that bluefin tuna embryos, larvae, juveniles, and
adults were likely exposed to crude oil-derived
PAHs (14). Many other Gulf of Mexico pelagics
may have spawned in oiled habitats, including
yellowfin tuna, dolphin fish, blue marlin, and
swordfish (15).

To more precisely define the mechanisms of crude oil cardiotoxicity and to
evaluate the potential vulnerability of eggs, larvae, and juveniles in
the vicinity of the DWH spill, we assessed the
impact of field-collected DWH oil samples on
in vitro cardiomyocyte preparations dissociated
from the hearts of bluefin tuna (T. orientalis) and
yellowfin tuna (T. albacares). Juvenile tunas were
caught at sea and held in captivity at the Tuna
Research Conservation Center and the Monterey
Bay Aquarium (16).

The cardiotoxic effects of four distinct environmental samples of MC252 crude
oil were assessed as water-accommodated fractions (WAFs)
prepared in Ringer solution for marine fish (16).
Oil samples were collected under chain of custody
during the DWH spill response effort. The samples
included riser “source” oil (sample 072610-03), riser
oil that was “artificially weathered” by heating at
90° to 105°C (sample 072610-W-A), and two skimmed
oil samples: “slick A” (sample CTC02404-02),
collected 29 July 2010, and “slick B” (sample
GU2888-A0719-OE701), collected 19 July 2010
by the U.S. Coast Guard cutter Juniper. High-energy WAFs were prepared in a
commercial blender that dispersed oil droplets to mimic release conditions at
the MC252 well head (16).
As expected from previous studies (11, 12), the
total sum (Σ) of PAHs declined in WAFs from
source oil to the surface-weathered samples, owing
to loss of naphthalenes, whereas the total concentrations of three-ringed PAHs
(e.g., phenanthrenes) increased proportionately (fig. S1 and table S1).
PAH concentrations were in a range expected to
cause cardiotoxicity in intact embryos and consistent with the ΣPAHs measured
in some surface water samples during the DWH oil spill (up to
84 µg/liter) (6, 17). WAFs in Ringer solution
were perfused over freshly dissociated, isolated
tuna cardiomyocytes, and we assessed the effects
of these oil-containing solutions on excitation-contraction (EC) coupling using
electrophysiological and Ca²⁺-imaging techniques.
Patch-clamp recordings revealed a strong effect of DWH source oil and weathered
oil on the action potential duration (APD) of bluefin and yellowfin tuna
cardiomyocytes (Fig. 1 and fig. S3).
A concentration-dependent lengthening of the APD waveform was observed in both
tuna species. APD at 90% repolarization (APD90, equivalent
to the QT interval on an electrocardiogram) was
significantly increased across all four oil samples at ΣPAH concentrations
ranging from 4 to 61 µg/liter (table S1). The source and weathered
oils significantly decreased the APD at 10% repolarization (APD10) (Fig. 1).
WAF exposures did not influence other action potential parameters,
such as resting membrane potential and action
potential amplitude (figs. S2 and S3). This suggests that IK1, the background
current responsible for resting membrane potential, and INa, the current
responsible for the upstroke of the action
potential, are not modified by crude oil. All four
oil samples significantly increased the time for
repolarization from APD30 to APD90. This increase in triangulation (Fig. 1, I
to L, and fig. S3, E and F) is a strong predictor of fatal cardiac arrhythmia
(18). Pharmacological agents that cause
a cardiac repolarization disorder by lengthening
cardiomyocyte APD, as well as congenital mutations of hERG (human
ether-à-go-go-related gene or KCNH2) channels, the mammalian homolog
to the fish delayed rectifier potassium current
Fig. 1. Effect of oil WAFs on action potential characteristics from bluefin tuna ventricular
cardiomyocytes. (A to D) Action potentials in controls (black) and with ascending concentrations of source
oil (blue traces), artificially weathered (orange traces), slick A (green traces), and slick B (red traces)
WAFs. (E to H) APD (expressed as a percentage of control) at 10, 50, and 90% repolarization in control
(black bars) and with ascending concentrations of source (n = 9), artificially weathered (n = 8), slick A (n = 7),
and slick B (n = 7). (I to L) Action potential triangulation (expressed as a percentage of control;
calculated as APD90 − APD30) in control (black bars) and with ascending concentrations of source,
artificially weathered, slick A, and slick B. (E) to (L): Means ± SEM. *P < 0.05.
(IKr) (19), are known to cause or aggravate ventricular arrhythmias, which can
result in torsade de pointes and/or sudden death (20).
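
The APD and triangulation measures used above can be made concrete with a short sketch. It assumes a digitized voltage trace that starts at the resting potential, measures APD from the time of the peak, and uses linear interpolation between samples; the function name `apd` and the synthetic trace are hypothetical and are not taken from the recordings in this study.

```python
def apd(times_ms, voltages_mv, percent):
    """Time from the action potential peak to `percent` % repolarization.

    Assumes the first sample is at the resting membrane potential.
    """
    rest = voltages_mv[0]
    peak_idx = max(range(len(voltages_mv)), key=voltages_mv.__getitem__)
    peak = voltages_mv[peak_idx]
    # Voltage at which the requested fraction of the amplitude has been recovered.
    threshold = peak - (percent / 100.0) * (peak - rest)
    for i in range(peak_idx + 1, len(voltages_mv)):
        v0, v1 = voltages_mv[i - 1], voltages_mv[i]
        if v1 <= threshold:
            t0, t1 = times_ms[i - 1], times_ms[i]
            frac = (v0 - threshold) / (v0 - v1)  # linear interpolation
            return t0 + frac * (t1 - t0) - times_ms[peak_idx]
    raise ValueError("trace does not repolarize to the requested level")


# Synthetic, piecewise-linear trace (hypothetical values).
times = [0.0, 1.0, 101.0, 201.0, 301.0]
volts = [-80.0, 20.0, 0.0, -50.0, -80.0]

apd30 = apd(times, volts, 30)
apd90 = apd(times, volts, 90)
triangulation = apd90 - apd30  # APD90 − APD30, as in Fig. 1, I to L
print(f"APD30 = {apd30:.1f} ms, APD90 = {apd90:.1f} ms, "
      f"triangulation = {triangulation:.1f} ms")
```

In this framing, a treatment that lengthens late repolarization more than early repolarization increases the triangulation value even if the peak and resting potential are unchanged.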

Overall, the effects of MC252 oil WAFs on cardiomyocyte action potentials in
bluefin and yellowfin tunas were similar. However, the cardiotoxic potency of
each oil sample correlated closely with the concentrations of three-ringed
PAHs rather than total ΣPAHs (fig. S4), as evidenced in particular by the
extensively weathered slick B sample (fig. S1 and table S1), which increased
both APD and triangulation without affecting resting membrane potential or
amplitude (Fig. 1 and figs. S2 and S3). In some cardiomyocytes, WAFs caused
unstable action potentials after depolarizations (fig. S5B). Such arrhythmias
were not observed among ventricular cells in
Ringer solution over an equivalent recording duration (fig. S5A).

The functional effects of PAHs on fish cardiac rhythmicity suggest that
components of crude oil interfere with EC coupling, which links electrical
excitation to contraction in cardiomyocytes
(21, 22). Depolarization of the cardiac sarcolemmal
membrane opens voltage-gated ion channels, including L-type Ca²⁺ channels,
which results in Ca²⁺ entry into the cytosol. This Ca²⁺ transient
triggers the release of additional Ca²⁺ from internal stores [sarcoplasmic
reticulum (SR)] by means of a Ca²⁺-induced Ca²⁺ release mechanism (CICR)
(23–25). The rise in intracellular Ca²⁺ activates
the contractile machinery within the cardiomyocyte. Critical for action
potential repolarization are the opening and closing of voltage-gated Na⁺,
Ca²⁺, and K⁺ channels, which renew the EC coupling process at every heartbeat.
The repolarization of the tuna cardiomyocyte action potential involves a
delicate balance of inward and outward
ionic currents. Thus, cardiac action potential prolongation could be due to a
decrease in outward current, an increase in inward current, or both. To
distinguish between these possibilities, we used
electrophysiological analyses (voltage clamp) to
investigate the influence of slick B (as a representative oil sample of all
four WAFs) on the major outward currents (IKr) and inward calcium current
(ICa) in isolated cardiomyocytes.

We characterized the rapid component of the delayed potassium current (IKr) in
the bluefin tuna using electrophysiological and pharmacological techniques as
previously described [i.e., E-4031-sensitive current (26)]. In the bluefin tuna
ventricular cardiomyocyte, IKr amplitude and tail
current were reduced in a concentration-dependent
manner in response to exposures to slick B WAF
(Fig. 2, A to C), with a half-maximal inhibitory
concentration (IC50) of 51 ± 6 µg PAHs per liter
and a Hill coefficient of 1.19 ± 0.11. Perfusion
with surface oil (slick A) also decreased IKr in
bluefin tuna ventricular cardiomyocytes (fig. S6)
with a similar IC50 (53 ± 31 µg/liter) and Hill
coefficient (1.16 ± 0.43). In yellowfin tuna, exposure of ventricular
cardiomyocytes to slick B WAF also significantly decreased IKr tail currents
(IC50 = 61 ± 12 µg/liter, Hill coefficient = 0.84 ±
0.11) (fig. S7). Taken together, these data show
that DWH crude oils significantly decrease IKr
currents in both species. The effect of slick B
(22 µg PAHs per liter) on the IKr current-voltage (I-V) relation is shown in
Fig. 2D. WAF perfusion reduced IKr amplitudes across all voltages without
affecting the shape of the I-V curve
(Fig. 2E). In addition, IKr tail currents were decreased at all voltages
without shifting the curve
(Fig. 2F). Bluefin tuna ventricular cardiomyocytes
exposed to source oil (61 µg/liter) and yellowfin
tuna ventricular cardiomyocytes exposed to slick
B (22 µg/liter) showed comparable blockade of
IKr (figs. S8 and S9). Thus, dissolved constituents
of MC252 crude oil do not affect the voltage-dependent (gating) properties of
the K⁺ channel but rather inhibit outward conductance in
the open state, most likely by blocking the K⁺
channel pore. This mechanism would be consistent with the observed prolongation
of the cardiomyocyte action potential. To confirm this, we
perfused tuna ventricular cardiomyocytes with
the specific IKr blocker, E-4031 (2 µM in Ringer
solution). As anticipated, E-4031 significantly prolonged APD90, consistent
with IKr shaping the repolarization of bluefin and yellowfin tuna
cardiomyocytes (fig. S10).
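
The IC50 and Hill coefficient reported above summarize a concentration-response curve. As a minimal sketch, assuming the standard Hill form for inhibition (the text does not state the exact equation that was fitted), the fraction of IKr remaining at a given PAH concentration can be estimated as follows. The function name `fraction_remaining` is hypothetical; the parameter values are the bluefin slick B estimates quoted in the text.

```python
def fraction_remaining(conc, ic50, hill):
    """Fraction of current remaining under the standard Hill inhibition model.

    Assumed form: remaining = 1 / (1 + (conc / ic50) ** hill).
    """
    if conc < 0 or ic50 <= 0:
        raise ValueError("concentration must be >= 0 and IC50 > 0")
    return 1.0 / (1.0 + (conc / ic50) ** hill)


# Point estimates from the text for bluefin tuna, slick B WAF.
IC50_UG_PER_L = 51.0
HILL = 1.19

at_ic50 = fraction_remaining(IC50_UG_PER_L, IC50_UG_PER_L, HILL)
at_22 = fraction_remaining(22.0, IC50_UG_PER_L, HILL)
print(f"fraction of IKr remaining at IC50: {at_ic50:.2f}")
print(f"fraction of IKr remaining at 22 ug/liter: {at_22:.2f}")
```

By construction the curve passes through 50% at the IC50; the value at 22 µg/liter is only a model-based illustration and ignores the reported uncertainty in both parameters.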

ICa also plays a critical role in cardiomyocyte
APD (27, 28). Exposure to the weathered slick B
surface sample significantly decreased the amplitude of ICa (Fig. 3, A to C)
in a concentration-dependent manner, with an IC50 of 36 ± 7 µg
PAHs per liter and a Hill coefficient of 0.76 ±
0.13 for bluefin tuna cardiomyocytes. Note that
slick B WAF also slowed the inactivation decay
of ICa (Fig. 3, D and E) and thereby allowed more
Ca²⁺ entry during depolarization (27). As indicated by quantification of Ca²⁺
entry, there was a small, but not significant, decrease in charge
passing through the channel during the square
pulse (Fig. 3, F and G). I-V relations (Fig. 3H)
revealed an inhibitory effect of slick B WAF on
ICa across all voltages, with a slight influence on
the shape of the I-V curve (Fig. 3I, top), which
suggested a change in the voltage-dependent properties of Ca²⁺ channels.
Perfusion with the slick B WAF shifted the activation curve toward more
hyperpolarized potentials (by ~7 mV) (Fig. 3I,
bottom), which allowed more Ca²⁺ entry at negative potentials. As with bluefin
tuna, slick B WAF (22 µg ΣPAHs per liter) also inhibited ICa
in ventricular cardiomyocytes of yellowfin tuna
(IC50 = 46 ± 5 µg/liter, Hill coefficient = 1.01 ±
0.09) (fig. S11).
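
Why a hyperpolarizing shift of the activation curve permits more channel opening at negative potentials can be illustrated with a Boltzmann function, a common description of voltage-dependent activation. This is a sketch under stated assumptions: the text does not report the fitted half-activation voltage or slope factor, so `V_HALF_CONTROL_MV` and `SLOPE_MV` below are hypothetical; only the ~7 mV shift comes from the text.

```python
import math


def activation(v_mv, v_half_mv, slope_mv):
    """Boltzmann activation: fraction of channels activated at voltage v_mv."""
    return 1.0 / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))


V_HALF_CONTROL_MV = -20.0  # hypothetical half-activation voltage
SLOPE_MV = 6.0             # hypothetical slope factor
SHIFT_MV = -7.0            # approximate leftward shift reported in the text

test_voltage = -30.0
control = activation(test_voltage, V_HALF_CONTROL_MV, SLOPE_MV)
shifted = activation(test_voltage, V_HALF_CONTROL_MV + SHIFT_MV, SLOPE_MV)
print(f"activation at {test_voltage} mV: control {control:.2f}, shifted {shifted:.2f}")
```

With any positive slope factor, the shifted curve lies above the control curve at every voltage, so the relative gain is largest at negative potentials where control activation is low.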

To further explore the influence of DWH oil
on the voltage-dependent properties of cardiac
Ca²⁺ channels, ICa was measured in bluefin cardiomyocytes with Ba²⁺ as a
charge carrier. In the absence of Ca²⁺-dependent inactivation,
the channel inactivates primarily via voltage-dependent processes (27).
Similar to the effects on ICa, slick B WAF significantly decreased the
amplitude of IBa but did not slow the inactivation of the current (fig. S12).
This suggests that the observed change in ICa inactivation rate in response to
oil is Ca²⁺-dependent and not voltage-dependent. The decrease in ICa amplitude
and slowing of inactivation might have countervailing effects on Ca²⁺ entry
during the plateau phase of the action potential, as measured from action
potential waveforms in response to physiological
pulses (29).

The entry of Ca²⁺ via ICa during action potentials was similar among controls
and ventricular cardiomyocytes perfused with slick B WAF
(22 µg ΣPAHs per liter) for bluefin (fig. S13) and
yellowfin tunas (fig. S14). Overall, the absence
of an effect of crude oil on Ca²⁺ entry during a
physiological pulse is attributable to (i) an increase in APD, allowing more
time for Ca²⁺ entry; (ii) a leftward shift in the activation properties of
Ca²⁺ channels; and (iii) a slowing of
ICa inactivation. Although our findings are not
sufficient to explain action potential prolongation, they show that DWH crude
oil significantly decreases ICa amplitude in cardiomyocytes of
tunas. L-type Ca²⁺ channels play a key role in initiating the critical CICR
from SR internal stores;
thus, the next series of experiments were designed
to measure whole-cell Ca²⁺ cycling in isolated
cardiomyocytes exposed to DWH oils.


Intracellular Ca²⁺ transients in bluefin tuna cardiomyocytes were recorded
using confocal microscopy and Ca²⁺-sensitive dye (Fluo-4). Exposures to each
oil sample (source, artificially
weathered source, slick A, and slick B at 30,
18, 7, and 11 µg PAHs per liter, respectively)
significantly decreased the Ca²⁺ transient amplitudes and slowed the decay of
the Ca²⁺ transients in bluefin tuna ventricular cardiomyocytes
(Fig. 4). This reduction in Ca²⁺ transient amplitudes would decrease
contractility and would reduce cardiac output at the scale of the whole
heart. A diminished cytosolic Ca²⁺ transient could
be a consequence of reduced extracellular Ca²⁺
influx, a smaller Ca²⁺ release from SR internal
stores, or both (24, 30).

The direct measurements of Ca²⁺ transients in
cardiomyocytes indicate there may be inhibitory
effects of oil on SR Ca²⁺ release and/or reuptake.
To investigate the possible SR sites of interaction,
cardiomyocytes were exposed to pharmacological inhibitors of SR Ca²⁺ release
channels (5 µM ryanodine) and Ca²⁺ adenosine triphosphatase (ATPase) pumps
(2 µM thapsigargin) for at least 30 min before exposures to WAFs.
Under pharmacological blockade, the four distinct crude oil samples had no
significant effect on the amplitude of the cytosolic Ca²⁺ transient
(Fig. 4D). This indicates that the oil-induced
decrease in Ca²⁺ transient amplitude observed
in the absence of blockers is due to a disruption
of SR Ca²⁺ release and/or reuptake from internal stores. However, these toxic
effects of oil on intracellular Ca²⁺ cycling were partially offset by
an additional influx of ICa via L-type Ca²⁺ channels during action potential
prolongation, consistent with the ICa results in Fig. 3.

Our experimental findings provide a mechanistic underpinning for
cardiac-specific physiological defects previously reported and reinforce
the findings that crude oil has deleterious physiological impacts on fish
hearts (10). Our results with crude oil are similar to the physiological
effects of the antimalarial drug halofantrine, a
chemical with structural similarities to three-ringed PAHs that causes K⁺
channel inhibition and cardiac arrhythmias (31). Our results indicate that
compounds in DWH oil act through a cardiotoxic mechanism with direct effects
on ion channels involved in the EC coupling and cardiac
contractility of cardiomyocytes. These pathways
in cardiac muscle cells are highly conserved across
all vertebrates, which explains the common, canonical crude oil toxicity
syndrome observed in a diversity of fish species from habitats that range
from tropical freshwater (zebrafish) to boreal marine (herring).

In conclusion, the oil-induced disruption of
cardiomyocyte repolarization via K⁺ channel
blockade and sarcolemmal and SR Ca²⁺ cycling
should call attention to a previously underap- 
Massively Parallel Single-Cell
RNA-Seq for Marker-Free Decomposition
of Tissues into Cell Types
In multicellular organisms, biological function emerges when heterogeneous cell types form
complex organs. Nevertheless, dissection of tissues into mixtures of cellular subpopulations is
currently challenging. We introduce an automated massively parallel single-cell RNA sequencing
(RNA-seq) approach for analyzing in vivo transcriptional states in thousands of single cells.
Combined with unsupervised classification algorithms, this facilitates ab initio cell-type
characterization of splenic tissues. Modeling single-cell transcriptional states in dendritic cells and
additional hematopoietic cell types uncovers rich cell-type heterogeneity and gene-modules activity
in steady state and after pathogen activation. Cellular diversity is thereby approached through
inference of variable and dynamic pathway activity rather than a fixed preprogrammed cell-type
hierarchy. These data demonstrate single-cell RNA-seq as an effective tool for comprehensive
cellular decomposition of complex tissues.


Understanding the heterogeneous and stochastic nature of multicellular tissues
is currently approached through a priori defined cell types that are used to
dissect cell populations along developmental and functional hierarchies (1–3).
This methodology heavily relies on enumeration of cell types and their precise
definition, which can be controversial (4–7) and
is based in many cases on indirect association of
function with cell-surface markers (5–8). Perhaps
the best understood model for cellular differentiation and diversification is
the hematopoietic system. The developmental tree branching from
hematopoietic stem cells toward distinct immunological functions was carefully
worked out through many years of study, and effective cell-surface markers are
available to quantify and sort the major hematopoietic cell types. Even in this
well-explored system, however, it is becoming
increasingly difficult to explain modern genome-wide and in vivo data with a
refined hierarchy of cell types and functions that extend beyond the
classical myeloid and lymphoid cell types. For
example, dendritic cells (DCs) are antigen-presenting
cells that were originally characterized through
their morphology (9) but are now understood to
represent a highly heterogeneous group (10) with
multiple functions, regulatory circuits, and phenotypes (6, 7, 9). Despite
considerable efforts and progress by use of the marker-based approach,
much of the known functional heterogeneity within
the DC group is not truly compatible with any of
the DC subclassification schemes (6, 7, 11). Such
lack of definitive models for cell types and states
is common in many fields of biology.


An attractive alternative to marker-based cellular dissection of complex
tissues is to characterize in vivo cell-type compositions through unsupervised
sampling and modeling of transcriptional states
in single cells. This natural approach has so far been
difficult to implement because of many technical
limitations that are being progressively alleviated
with the advent of single-cell RNA sequencing
(RNA-seq) (12–20). Sampling and sequencing
RNA from dozens of single cells was recently
used to estimate stochastic transcriptional variation in stationary cultured
cells (14) or during a dynamic process (12–14, 16, 19). An unsupervised
framework for dissecting transcriptional heterogeneity within complex tissues
may therefore be envisioned, provided that many thousands of cells
can be assayed routinely by using single-cell RNA-seq and that data from such
experiments can be normalized and modeled effectively even when
cells represent highly diverse cell types and states.

We developed an automated massively par- 
allel RNA single-cell sequencing framework 
Fig. 1. Massively parallel single-cell RNA-seq. (A) Distribution of mapped reads per cell in a multiplexed 1536-cell experiment. (B) Mean and variance in
mRNA (blue) and spike-in controls (red). (C) Mean mRNA counts in replicated pooled population of homogeneous (FACS-sorted) pDCs.


Fig. 2. Single-cell dissection of immune cell
types. (A) Color-coded correlation matrix of single-cell mRNA profiles. Groups
of strongly correlated cells that are used to initialize a probabilistic
mixture model are numbered and marked with white
frames. (B) Circular a posteriori projection (CAP)
plot summarizing the predictions of the probabilistic mixture model for the
CD11c+ cells. Each cell is projected onto the two-dimensional sphere
according to the posterior probability of its association with the model's
classes. The dimensions of the CAP plot should not be interpreted linearly
or as principal components. (C) Bar plots depicting similarities of mean RNA
counts in inferred types and Immgen expression profiles. The most
correlated group of Immgen profiles is colored
specifically as indicated for each type. (D) Shown
are CAP plots depicting single-cell RNA-seq data
sets acquired from marker-based FACS sorting for
single pDCs, B cells, NK cells, and monocytes. Sorted
cells are shown in red; density of the CD11c+ pool
is shown in gray.
(MARS-Seq) (figs. S1 to S6) (21) that is designed
for in vivo sampling of thousands of cells by multiplexing RNA-seq while
maintaining tight control over amplification biases and labeling errors. The
Fig. 3. Response to LPS across multiple cell types.
(A) Inferred cell-type frequencies before and after LPS
treatment. (B) Clustering of more than 1300 genes given
the mean inferred transcriptional level in each cell type
before and after LPS infection (+). Full gene list is
provided in table S4.




Fig. 4. Gene modules and the distribution and redistribution of DC 
states. (A) Single-cell correlation matrix for cells classified as DCs, showing 
detected subclasses using white frames. (B) Type/class distributions of single- 
cell RNA-seq data from three different FACS-sorted DC (CD11c enriched) 


method is based on fluorescence-activated cell
sorting (FACS) of single cells into 384-well plates
and subsequent automated processing that is done
mostly on pooled and labeled material, leading to
a dramatic increase in throughput and reproducibility. To explore the new
technique, we sequenced RNA from more than 4000 single mouse spleen
cells (table S1), focusing initially on a heterogeneous cell population
enriched for expression of the CD11c surface marker. We hypothesized that
this strategy for cell acquisition would sample a diverse
collection of splenic cell types while focusing on
the challenging DC populations (6, 7).

Our methodology uses three levels of barcod- 
ing (molecular-, cellular-, and plate-level tags) to 
facilitate molecule counting with a high degree of 
multiplexing. The strategy is to characterize cell 
subpopulations first by classifying single cells on 
the basis of low-depth RNA sampling and then 
studying transcriptional profiles at high resolution 
by integrating data from dozens to hundreds of 
cells within each unsupervised class. As shown in 
Fig. 1A, multiplexing 1536 cells in one sequenc- 
ing lane provided an average of 22,000 aligned 
reads per cell, and after extensive normalization, 
these can be used to unambiguously define 200 to 
1500 distinct RNA molecules from each cell. Our 
labeling and filtering scheme ensures that spiked- 
in technical controls show cell-to-cell variance 
that is compatible with the theoretical (binomial) 
sampling noise, comparing favorably with previ- 
ously reported techniques (Fig. 1B) (/8). This 
technical stability substantially increases the in- 
formation content of the sampled transcriptional 
states, which can be directly modeled as unbiased 
samples of the cells’ mRNA pool. In contrast to
technical spike-in controls or the bulk of detected
genes, we observed high cellular variance for a
large number of genes, many of which are well-known
cell type-specific markers. This variance attests
to the high degree of heterogeneity
within the splenic cell population (Fig. 1B) and
supports the idea of classifying cells into subpopulations
on the basis of covariation of such
heterogeneous markers.
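
As an illustration of how plate-, cell-, and molecule-level tags can support molecule counting, the following Python sketch collapses reads that share all three tags for a given gene into a single molecule. It is a minimal sketch under stated assumptions, not the pipeline used in this work: the function name count_molecules, the tuple layout, and the example tag and gene values are hypothetical, and steps such as alignment, tag error correction, and filtering are omitted.

```python
from collections import defaultdict


def count_molecules(reads):
    """Collapse reads into distinct-molecule counts per (plate, cell, gene).

    Each read is a tuple (plate_tag, cell_tag, molecule_tag, gene).
    Reads that agree on all four fields are treated as amplification
    copies of one original RNA molecule and are counted once.
    """
    seen = defaultdict(set)
    for plate_tag, cell_tag, molecule_tag, gene in reads:
        seen[(plate_tag, cell_tag, gene)].add(molecule_tag)
    return {key: len(tags) for key, tags in seen.items()}


# Hypothetical example reads (tags and genes chosen for illustration only).
reads = [
    ("P1", "C01", "AAGT", "Cd74"),
    ("P1", "C01", "AAGT", "Cd74"),  # duplicate read of the same molecule
    ("P1", "C01", "CCTA", "Cd74"),  # second molecule in the same cell
    ("P1", "C02", "AAGT", "Cd74"),  # same molecule tag, different cell
    ("P2", "C01", "GGAC", "Irf7"),  # same cell tag, different plate
]
counts = count_molecules(reads)
```

Keying on the plate and cell tags together means that the same cell tag can be reused across plates without mixing counts.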
populations: CD8a+ CD86+; CD8a intermediate (int) CD86−; and CD8a− CD4+
ESAM+ (fig. S13A). (C) Gene correlation matrix depicting potential
LPS-dependent interactions between 225 genes. Key genes are indicated, with the
complete list available in fig. S15.
To test how sensitive our strategy can be for
characterizing the transcriptional state of sub- 
populations in the sample, we estimated coverage 
and mean mRNA molecule count reproducibility 
for groups of 10 to 40 single-cell profiles, repre- 
senting 0.6 to 2.4% of the cells on one sequencing 
lane. Analysis of single-cell mRNA profiles from 
FACS-sorted plasmacytoid DCs (pDCs) (Fig. 1C 
and fig. S6) confirmed that pooling of homoge- 
neous cell populations provides rich and highly 
reproducible transcriptional profiles. For a subpopulation
at a frequency of 2.5%, the assay reports
on 1255 genes with a standard deviation of less
than 35% of the mean, and on 324 genes with a
standard deviation of 20% of the mean. Together,
the availability of high-variance marker genes and 
the dynamic range provided by pooled single-cell 
transcriptional profiles enable unsupervised dis- 
section and characterization of heterogeneous cell 
populations, opening the way for ab initio cell- 
type decomposition of splenic populations at a high 
level of detail. 
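
The benefit of pooling can be illustrated with a short calculation. The sketch below assumes that technical noise is pure sampling noise and approximates binomial sampling of rare molecules by a Poisson distribution, so that the coefficient of variation of a pooled count is one over the square root of the expected number of molecules in the pool. The function names and example values are hypothetical, and biological variability between cells would add to this lower bound.

```python
import math


def pooled_cv(mean_per_cell, n_cells):
    """Coefficient of variation of a pooled count under Poisson sampling.

    mean_per_cell: expected molecules of one gene detected per cell.
    n_cells: number of single-cell profiles pooled in silico.
    """
    expected_total = mean_per_cell * n_cells
    if expected_total <= 0:
        raise ValueError("expected pooled count must be positive")
    return 1.0 / math.sqrt(expected_total)


def min_pooled_molecules(target_cv):
    """Expected pooled molecules needed to reach a target coefficient of variation."""
    if target_cv <= 0:
        raise ValueError("target_cv must be positive")
    return 1.0 / target_cv ** 2


# A gene detected at 0.25 molecules per cell, pooled over 10 or 40 cells.
cv_10_cells = pooled_cv(0.25, 10)
cv_40_cells = pooled_cv(0.25, 40)
```

Under these assumptions, a gene needs roughly 25 expected molecules in the pool to reach a standard deviation of 20% of the mean, which is why low-depth sampling of many cells can still yield reproducible class-level profiles.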

We have implemented a probabilistic strategy 
for unsupervised classification of cells into “idealized
types.” Hierarchical clustering (Fig. 2A) defined
seeds of highly correlated cells, leading to the ini- 
tialization of a probabilistic mixture model and 
classification of single cells into types or families 
of homogeneous states. Visualization of the multi- 
class data by using a new circular a posteriori 
projection technique (Fig. 2B) represented the 
splenic cell population as a combination of several 
molecular behaviors, five of which (classes I to V)
are distinctly separated from a group of more
loosely defined classes (classes VI and VII). The
frequencies of classes I to V range between 3.7
and 17%, in all cases allowing rich transcriptional
states to be inferred through in silico pooling of single-cell
mRNA profiles within each class. Analysis of
gene enrichment (table S2 and figs. S7 and S8) and 
comparison of these profiles with existing tran- 
scriptional profiles of classical hematopoietic pop- 
ulations (www.immgen.org), unambiguously linked 
classes I to V to B cells, natural killer (NK) cells, 
macrophages, monocytes, and pDCs (Fig. 2C). The 
remaining classes were all linked to DCs. FACS 
analysis using classical surface markers confirmed 
our in silico estimations of the frequency of B 
cells and pDCs within the CD11c-enriched 
splenic cell population (fig. S9). Further analysis 
and additional single-cell quantitative polymer- 
ase chain reaction experiments confirmed that 
“marker” genes are robustly enriched in their 
relevant subpopulations (figs. S10 and S11). 
Using classical marker-based sorting, we fur- 
ther validated our approach with additional 
single-cell RNA-seq data from FACS-sorted B 
cells, NK cells, pDCs, and monocytes. Projection 
of the new data onto the model we generated from 
the splenic population showed remarkable com- 
patibility between the traditional marker-based 
cell-type definition and the marker-free single-cell 
RNA-seq technique (Fig. 2D). Analysis of splenic 
cell populations therefore showcased single-cell 
RNA-seq as a direct and unsupervised way of
identifying and characterizing subpopulations
within heterogeneous tissues. 
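
The seeded classification strategy described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the model used in this work: the text specifies only a probabilistic mixture model initialized from seeds, so the choice of a multinomial mixture fitted by expectation-maximization, the pseudocount, the function names, and the toy count matrix are all hypothetical. Seed groups are passed in directly, standing in for the output of hierarchical clustering.

```python
import math


def normalise(values):
    """Scale a list of non-negative numbers so that it sums to one."""
    total = sum(values)
    return [v / total for v in values]


def fit_mixture(counts, seeds, n_iter=20, pseudo=0.5):
    """Fit a multinomial mixture by EM, initialised from seed cell groups.

    counts: per-cell lists of molecule counts (one entry per gene).
    seeds: list of lists of cell indices, one list per class.
    Returns class weights, class gene profiles, and per-cell posteriors.
    """
    n_cells, n_genes, k = len(counts), len(counts[0]), len(seeds)
    # Initial class profiles: pooled counts of the seed cells plus a pseudocount.
    profiles = [
        normalise([pseudo + sum(counts[i][g] for i in members) for g in range(n_genes)])
        for members in seeds
    ]
    weights = [1.0 / k] * k
    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each cell.
        posteriors = []
        for cell in counts:
            log_p = [
                math.log(weights[c])
                + sum(x * math.log(profiles[c][g]) for g, x in enumerate(cell))
                for c in range(k)
            ]
            top = max(log_p)  # subtract the maximum for numerical stability
            posteriors.append(normalise([math.exp(lp - top) for lp in log_p]))
        # M-step: smoothed class weights and posterior-weighted gene profiles.
        weights = [(1.0 + sum(p[c] for p in posteriors)) / (n_cells + k) for c in range(k)]
        profiles = [
            normalise([
                pseudo + sum(p[c] * cell[g] for p, cell in zip(posteriors, counts))
                for g in range(n_genes)
            ])
            for c in range(k)
        ]
    return weights, profiles, posteriors


# Toy data: six cells, three genes; cells 0-1 and 2-3 act as seeds.
counts = [[9, 1, 0], [8, 2, 0], [0, 1, 9], [1, 0, 8], [7, 2, 1], [0, 2, 7]]
weights, profiles, posteriors = fit_mixture(counts, seeds=[[0, 1], [2, 3]])
labels = [max(range(len(p)), key=p.__getitem__) for p in posteriors]
```

In this toy example the two unseeded cells are assigned to the class whose pooled profile they resemble, which mirrors the idea of classifying cells from low-depth data and then pooling within each class.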

We profiled an additional 1536 single cells from
spleens that were exposed to lipopolysaccharide 
(LPS) for 2 hours (22), aiming to test how an im- 
mediate response to an infection-mimicking stim- 
ulus can be deciphered across the heterogeneous 
splenic cell population. We found that the LPS- 
treated cells are broadly classified into similar cell 
types to those observed in untreated cells, with 
some changes in the relative representation of dif- 
ferent types (Fig. 3A). Using the non-LPS mixture 
model, we classified the non-LPS and LPS-exposed 
cells into classes and inferred a rich transcriptional 
profile within each class before and after treatment. 
Clustering 1575 variable genes identified groups of 
cell type-specific response genes (such as Tnf and
Marco in macrophages and Xcl1 and Gzmb in NK
cells), and a large group of type I interferon response
genes (Irf7, Stat2, Ifit1, Cxcl10, and hundreds more)
activated pervasively in all or almost all cell types 
(Fig. 3B, fig. S12, and tables S3 and S4). 

With thousands of samples readily available, 
single-cell RNA-seq is poised to go beyond the 
classical cell-type hierarchies that are outlined by
current marker-based approaches, examining com- 
plex relations between cell subpopulations or con- 
tinuous spectra of types. Analysis of 1031 single 
cells that were associated with DC-related classes 
(VI and VII) in our unsupervised CD11c+ model
(Fig. 4A) indicated that although 15% of these
cells (class DC1) are strongly linked together, the
remaining bulk of DCs could not be organized
along a clear clustering hierarchy (11). Nevertheless,
we found strong support for substantial internal 
organization within the remaining DC population 
(DC2 to DC4) (table S5), including a group of 
cells coexpressing Relb, Nfkbia, and additional 
associated genes (DC2) (fig. S13). More gener- 
ally, we have identified several gene modules that 
represent combinatorial pathway activity within 
the DC bulk (fig. S14), indicating that despite the 
lack of clear hierarchy, the DC cell population is 
governed by a high degree of transcriptional or- 
ganization. Additional single-cell sequencing of 
CD8+ CD86+, CD8int CD86−, and CD4+ FACS-sorted
populations (Fig. 4B) showed that this organization
can be approached to a limited extent
with existing marker-based classification. Remark- 
ably, exposure to LPS reorganizes the DC popula- 
tion substantially, with a large number of gene 
modules being activated in a highly heterogeneous 
fashion (Fig. 4C and fig. S15). According to our 
analysis, certain specific CD4+ DC subpopulations
are activating the Irf4, tumor necrosis factor,
and transforming growth factor-β pathways (fig.
S16 and table S6), whereas other pathways (such 
as Irf7) are activated pervasively (table S5). This 
combinatorial activity of pathways within the LPS- 
exposed DC pool is not represented in preexisting 
DC subtypes according to our data. Committed and 
developmentally stable myeloid and lymphoid cell 
types maintain their identity during immediate 
response to infection while responding through 
generic and cell type-specific pathways. These 
pathways create substantial cell-to-cell variance
and define new cell subpopulations within each 
of these cell types (fig. S17), forming diversity 
that may have functional implications. Observa- 
tion of transcriptional subpopulations, however, 
does not necessarily imply the existence of fur- 
ther committed and preprogrammed cell subtype 
hierarchy. 

We presented this framework for broad sam- 
pling of single-cell transcriptional states from 
tissues and demonstrated how it can be used to 
dissect complex functions in a bottom-up fash- 
ion. MARS-seq can be readily applied to tissues 
and organs in normal and disease states to redefine
their cell-type and cell-state compositions
and link them to detailed genome-wide transcriptional
profiling. Given the inherent stochasticity
and heterogeneity of multicellular tissues, this ap- 
proach can prove essential for understanding how 
in vivo biological function emerges from complex 
cell ensembles. 
Leaf Shape Evolution Through
Duplication, Regulatory Diversification, 
and Loss of a Homeobox Gene 
In this work, we investigate morphological differences between Arabidopsis thaliana, which has
simple leaves, and its relative Cardamine hirsuta, which has dissected leaves comprising distinct
leaflets. With the use of genetics, interspecific gene transfers, and time-lapse imaging, we
show that leaflet development requires the REDUCED COMPLEXITY (RCO) homeodomain protein.
RCO functions specifically in leaves, where it sculpts developing leaflets by repressing growth
at their flanks. RCO evolved in the Brassicaceae family through gene duplication and was lost in
A. thaliana, contributing to leaf simplification in this species. Species-specific RCO action with
respect to its paralog results from its distinct gene expression pattern in the leaf base. Thus,
regulatory evolution coupled with gene duplication and loss generated leaf shape diversity by
modifying local growth patterns during organogenesis.


Understanding how form evolves requires
identifying the genetic changes underlying
morphological variation between
species and elucidating how those changes influence
morphogenesis. In this work, we investigate
this problem in the case of angiosperm
leaves. Both simple and dissected leaves initiate
as entire structures at the flanks of the pluripotent
shoot apical meristem (1). However, in dissected
leaves, elaboration of lateral growth axes after
leaf initiation generates leaflets (2, 3). So far, no
gene has been identified that expresses specifi- 
cally at developing leaflets and is sufficient to 
convert a simple leaf into a more complex one. 


Fig. 1. Mapping of RCO and complementation of 
the rco mutant. (A to C) Silhouettes of an A. thaliana 
simple leaf (A) with small marginal protrusions called 
serrations (red asterisk); a C. hirsuta dissected leaf (B) with 
lateral leaflets (black arrow) borne by petiolules (black 
arrowhead) and a terminal leaflet (TL); and a C. hirsuta rco 
mutant leaf (C), in which leaflets are converted to lobes
Rather, a key factor in leaflet formation is the
reactivation of meristem genes in leaves, which 
suggests that evolutionary differences in leaf com- 
plexity arose through modification of activity of 
genes that influence meristem function (4). This 
view is also consistent with the evolutionary or- 
igin of leaves from branched shoots (5). To deter- 
mine whether leaflet-specific factors exist, we 
conducted a genetic screen in Cardamine hirsuta, 
a dissected-leaf relative of the simple-leaf refer- 
ence plant Arabidopsis thaliana (Fig. 1, A and B) 
(6). If such genes exist, then loss of their function 
might prevent leaflet formation without perturbing 
meristem function. 

The recessive mutant, rco (reduced complex- 
ity), converts the C. hirsuta adult leaf from dis- 
sected into a simple lobed leaf (Fig. 1, B and C) 
without affecting the number and positioning of 
leaves. Thus, RCO is required for leaflet devel- 
opment but not leaf initiation (fig. S1, A to G). 
RCO is a homeobox gene present in the genome 
(red arrowhead). (D) Schematic representation of genes
in the RCO genetic interval predicted by sequence
similarity to the A. thaliana genome. C. hirsuta orthologs
are indicated with inverted commas; interval borders are
marked with black flags; and shading indicates genes
absent in A. thaliana. (E) Alignment of proteins encoded
by LMI1 and LMI1-like genes in A. thaliana and C. hirsuta,
respectively. The red arrow marks the last amino acid residue
(Val?4) of the truncated protein encoded by the rco mutant
transcript; horizontal lines indicate the homeobox (red line)
and homeobox-associated leucine zipper (blue line) domains.
Amino acid residues are shown as single-letter
abbreviations (25). (F to I) Complementation to WT
morphology (F) of the rco mutant phenotype (G) by transgenic
expression of RCO (rco; RCOg) (H) but not ChLMI1 (rco;
ChLMI1g) (I) genomic fragments. Fourth and fifth leaves
are shown. Scale bars in (A to C) and (F to I), 1 cm.
Fig. 2. The RCO expression pattern underlies its ability to promote leaf complexity.
(A to G) Complementary expression patterns of RCO and ChLMI1 shown by RCO::GUS
(A and B) and ChLMI1::GUS (E and F) reporter gene analysis and by RNA localization
of RCO (C and D) and ChLMI1 (G) at the shoot apex and fifth leaf of C. hirsuta.
(H to K) GUS staining in AtLMI1::GUS (H), RCO::GUS (I), ChLMI1::GUS (J), and
AaLMI1::GUS (K) A. thaliana leaves. (L) AaLMI1 RNA localization in vegetative
Aethionema arabicum leaf. (M to O) Phenotype of leaves four to six of C. hirsuta
wild type (M), rco mutant (N), and rco mutant complemented with an RCO::ChLMI1
transgene (O). (P to R) Rosettes of A. thaliana wild type (P) and transgenic RCOg (Q)
and AlRCOg (R) plants. (S) GUS staining in AlRCO::GUS A. thaliana leaf. Asterisks
indicate stipules. RNA localization images are minimal projections. Scale bars,
100 μm in (A) to (L) and (S); 1 cm in (M) to (R).


Fig. 3. RCO represses growth at the boundary between leaflets and does not influence
auxin-based patterning. (A and B) DR5::VENUS expression (yellow) and chlorophyll
autofluorescence signal (blue) in the seventh leaf of WT (A) and rco (B) C. hirsuta.
(C to H) Time-lapse of developing lateral leaflets in WT (C to E) and rco (F to H)
fifth leaves. Propidium iodide-stained leaf cells (green) are shown in (C) and (F)
for each time point. Heat maps of relative surface area increase over 48 hours of
growth (color bar: percentage increase) for lateral leaflets are shown in (D) and (G).
Heat maps of cell proliferation over 48 hours [color bar: number of cells (n)
originating from one initial cell] for lateral leaflets are shown in (E) and (H).
White dotted lines denote leaf margins; white dotted rings indicate areas with excess
growth and cell proliferation in the rco mutant. Scale bars, 100 μm in (A) and (B);
30 μm in (C) to (H).
of C. hirsuta but absent in A. thaliana, and the
rco phenotype is caused by reduced gene function
(Fig. 1, D to H, and fig. S1, H and I). Moreover,
RCO is part of a tandem gene triplication,
but the A. thaliana genome has only one of these
genes, LMI1 (LATE MERISTEM IDENTITY 1),
previously identified as a floral regulator (Fig. 1D)
(7). Phylogenetic analysis revealed that RCO
arose from the duplication of LMI1-type sequences
within the Brassicaceae after the divergence of
Aethionema and before the last common ancestor
of Arabidopsis and Brassica (fig. S2). RCO duplicated
further, yielding a cluster of three genes in
C. hirsuta, A. lyrata, and Capsella rubella, though
LMI1-like3 is not expressed in C. hirsuta, and concerted
evolution may have influenced this gene
cluster (fig. S1, I; fig. S2; and fig. S3). Secondary
loss of RCO-type sequences left LMI1 as a
singleton in the A. thaliana lineage (fig. S3). We
found that simply increasing the dose of LMI1-type
protein is not sufficient to suppress rco or
increase leaf complexity in A. thaliana (Fig. 1I
and fig. S4), suggesting that RCO function in
leaflet development arises from its specific
gene expression properties and/or protein function.
Thus, RCO is a taxonomically restricted gene
underlying a species-specific trait that distinguishes
two species that diverged relatively recently
(8-10).

We used reporter gene assays and RNA in situ 
hybridization to determine the expression pat- 
tern of RCO (Fig. 2, A to D, and fig. S5, A to F). 
RCO expression is restricted to developing leaves, 
in two small regions at the base of terminal and 
lateral leaflets, and is absent from the meristem- 
leaf boundary (Fig. 2, A to D; fig. S5, A to F; and 
fig. S6, A to C). By contrast, ChLMI1 is expressed
in a near-complementary pattern to RCO in terminal
and lateral leaflet margins and also in stipules
and flowers, similar to its A. thaliana ortholog
(Fig. 2, E to H; fig. S5G; and fig. S6, D to F) (7).
Thus, RCO activity at the base of leaflets is required
for leaflet formation, and its domain of
expression distinguishes RCO from its paralog
ChLMI1. The orthologous genes ChLMI1 and
AtLMI1 share comparable expression domains,
indicating that the distinct expression pattern of
RCO reflects regulatory diversification from
ChLMI1 (fig. S7). Consistent with this view, RCO
5’ regulatory sequences drive reporter gene expression
in more proximal and internal regions
of the A. thaliana leaf than those of ChLMI1,
AtLMI1, and Aethionema arabicum LMI1 (Fig. 2,
H to L; fig. S8, A to D; and fig. S9). Because
LMI transcripts do not accumulate at the leaf
base of A. arabicum, which is an early divergent
member of the Brassicaceae, these comparisons
indicate that the RCO expression pattern represents
an evolutionary novelty that arose after gene
duplication through neofunctionalization (11).

To evaluate whether protein sequence specificity
also contributes to RCO function, we expressed
ChLMI1 under the regulatory regions
of RCO and found that this transgene complemented
the rco mutant phenotype (Fig. 2, M
to O). Thus, RCO and ChLMI1 proteins are functionally
equivalent in this developmental context,
and these results suggest that the species-specific
action of RCO in leaflet formation reflects diversification
of gene expression from its paralog
ChLMI1. The absence of RCO, a leaf complexity
gene, from the A. thaliana genome suggests
that RCO plays a key role in shaping leaf diver- 
sity. If so, and given that the simple leaf shape 
of A. thaliana is evolutionarily derived (12), introducing
RCO into A. thaliana should reverse
some of the effects of evolution and increase leaf 
complexity. As predicted, A. thaliana transgenic 
lines carrying a C. hirsuta RCO genomic clone 
(RCOg) produced deep lobes never seen in the 
wild type, suggesting that RCO is sufficient to 
increase A. thaliana leaf complexity (Fig. 2, P 
and Q, and fig. S10). These transgenic lines lacked 
pleiotropic effects, consistent with a specific role 
of RCO in leaf margin morphogenesis (fig. S10). 
Furthermore, introducing a genomic clone of the
RCO ortholog of A. lyrata (AlRCOg), a lobed-leaf
species, into A. thaliana also produced lobed
leaves (Fig. 2R). This phenotype is likely to result
from AlRCO protein activity in the leaf
base because the AlRCO::GUS reporter is expressed
in a similar domain to the C. hirsuta
RCO gene (Fig. 2S and fig. S8D). The finding
that RCO::GUS and AlRCO::GUS are expressed
in the A. thaliana leaf base, despite the absence
of an RCO-type gene, suggests that at least part
of the ancestral regulatory landscape that promotes
leaf complexity through RCO activation
has been retained in this species and highlights
the importance of the leaf base as an organizing
region for leaf growth (13). Collectively, these
results indicate that localized RCO action is a 
key factor in determining leaf shape complex- 
ity in the Brassicaceae, and loss of this gene con- 
tributed to leaf simplification in A. thaliana. 

We next considered how RCO regulates leaf- 
let development. Auxin is required for leaflet de- 
velopment, and failure to organize discrete auxin 
activity maxima along the leaf margin reduces 
leaf complexity (3, 14). However, the auxin ac- 
tivity marker DR5 and the auxin efflux carrier 
PINFORMED1 were similarly expressed in leaf
primordia of rco and the wild type, with se- 
quential discrete auxin activity maxima form- 
ing at the leaf margin in both lobes and leaflets 
(Fig. 3, A and B, and fig. S11). Thus, RCO does 
not contribute to the establishment or mainte- 
nance of local auxin activity maxima that control 
leaflet initiation. RCO is expressed immediately 
adjacent to leaflet primordia (Fig. 2, A to D, and 
fig. S5, A to F), so drawing on classic ideas of 
leaf shape patterning (15), we hypothesized that 
RCO may influence growth locally to enable sep- 
aration of individual leaflets. To test this hypoth- 
esis, we used MorphoGraphX software to analyze 
time-lapse images of leaflet growth (16) (Fig. 3, C
to H). In wild-type (WT) plants, we observed that
cell expansion and proliferation are inhibited in the
marginal region between initiating leaflets (Fig.
3, D and E, and fig. S12, A to C). By contrast, in
the rco mutant these cells proliferate and grow
faster than in the wild type, filling up the space
between leaflets (Fig. 3, G and H, and fig. S12, A
to C). Conversely, cells within leaflets grow and
proliferate fast in both genotypes (Fig. 3, C to
H, and fig. S12, A to C). Thus, rco mutant
leaflets initiate and grow in a comparable way 
to the wild type but fail to separate properly from 
each other due to incomplete growth repression 
at their boundaries, resulting in a simplified leaf. 
We also analyzed A. thaliana leaves expressing 
RCOg and found significant repression of growth 
and cell proliferation adjacent to emerging 
serrations that transformed these small protrusions 
into deep lobes (fig. S12, D to F, and fig. S13). 
Thus, RCO contributes to growth repression 
between adjacent leaflets. 
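
The two growth measures reported in the heat maps, percentage increase in surface area and number of cells originating from one initial cell, can be expressed as simple calculations. The following Python sketch is only an illustration of those two definitions: it does not use or reproduce MorphoGraphX, and the function names, the lineage dictionary layout, and the example numbers are hypothetical.

```python
from collections import Counter


def area_increase_percent(area_start, area_end):
    """Percentage increase in surface area between two time points."""
    if area_start <= 0:
        raise ValueError("area_start must be positive")
    return 100.0 * (area_end - area_start) / area_start


def descendants_per_initial_cell(ancestor_of):
    """Count the cells at the final time point that derive from each initial cell.

    ancestor_of maps each cell label at the final time point to the label
    of its ancestor at the first time point.
    """
    return Counter(ancestor_of.values())


# Hypothetical measurements for one cell lineage over 48 hours.
growth = area_increase_percent(100.0, 250.0)

# Hypothetical lineage: initial cell "a" divided twice, "b" did not divide.
lineage = {"a1": "a", "a2": "a", "a3": "a", "a4": "a", "b1": "b"}
proliferation = descendants_per_initial_cell(lineage)
```

With measures of this kind, a boundary region with repressed growth would show both a small percentage increase in area and few descendants per initial cell relative to the neighboring leaflets.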

We found that ChLMI1 rescues the rco phenotype
when expressed from the RCO promoter
and the smooth leaf margin of A. thaliana lmi1
when expressed in its genomic context (Fig. 2O
and fig. S14). Growth derepression probably contributes
to this lmi1 phenotype and that of a
classical pea mutant, in which mutation in an
unusual LMI1-like gene converts filamentous leaf
tendrils into laminate leaflets (7, 17). Additionally,
both LMI1 and RCO repress growth when overexpressed
in A. thaliana (fig. S15). To understand
the degree of conservation of the growth-regulating
function of LMI1- and RCO-type genes during
crucifer evolution, we evaluated the ability of
selected genes from the phylogeny (fig. S2) to
modify A. thaliana leaf shape when expressed
under the control of the RCO promoter. With
the exception of the RCO-B gene of C. rubella,
all sequences assayed produced deep lobes in
A. thaliana leaves (fig. S16). Because these sequences
included LMI1 from the early divergent


Fig. 4. Evolution of RCO and its con- 
sequences for diversification of leaf shape 
in crucifers (based on phylogeny presented 
in figs. S2 and $3). The genome of Ae- 
thionema arabicum, a simple-leaf early di- 
vergent crucifer, contains a single LM//1-type 
gene. RCO-type genes arose from duplication 
of an LM/1-type gene after the divergence of 
Aethionema from core Brassicaceae. The ge- 
nomes of dissected-leaf C. hirsuta and lobed- 
leaf A. lyrata contain both LM/1-type and 
RCO-type genes. In the lineage that gave 
rise to A. thaliana, the RCO-type gene was 
secondarily lost, contributing to evolution 
of a simple leaf. Consistent with a role for 
RCO in promoting leaf complexity, removing 
its activity from C. hirsuta in the rco mutant 
leads to leaf simplification, whereas intro- 
ducing RCO-type function into A. thaliana 
results in a more complex leaf. Silhouettes 
are from adult leaves and are not to scale. 
A. lyrata and C. hirsuta also contain LM/1- 
like3, a third copy in this gene cluster that 


gia, it follows that the potential for LMI protein 
to repress growth evolved before the appearance 
of RCO in the Brassicaceae, and probably before 
the split of eudicots from other seed plants. With- 
in the Brassicaceae, evolution of RCO through 
gene duplication created a new version of these 
growth repressors that is active in the morphoge- 
netically important leaf base, thus contributing to 
diversification of leaf shape (Fig. 4). 

Leaflet formation in C. hirsuta requires the 
RCO homeobox gene that arose through gene 
duplication and is only expressed at initiating 
leaflets. Thus, evolutionary changes in leaf complexity
can arise through factors distinct from those
acting at the shoot apical meristem. RCO does 
not appear to act through the well-characterized 
auxin-based patterning that underpins leaflet and 
serration formation (Fig. 3, A and B, and fig. 
S11), or to regulate transcription of CUP-SHAPED 
COTYLEDON (CUC) or KNOTTED1-like homeobox
(KNOX) genes that influence this patterning (fig. 
S17) (3, 18, 19). One possibility is that RCO acts 
parallel to or downstream of these genes to regu- 
late leaf complexity, thus providing a means to 
uncouple growth and patterning inputs during 
evolution. RCOg expression in A. thaliana leaves 
transforms serrations into deep lobes by locally 
repressing growth adjacent to each serration (fig. 
S13); however, complete transformation into leaf- 
lets may require other genes that are active in 
C. hirsuta but not in A. thaliana leaves, such as 
CUC1 or KNOX genes (20, 21). RCO-mediated
shape diversification follows a broad principle 
of regulatory evolution: that morphological di- 
versity is driven by changes in gene expression 
that minimize fitness costs by circumventing 
pleiotropy (22, 23). It will be interesting to ex- 
plore whether the high tendency toward gene 
duplication in plants (24) was a major driver for 
arose through duplication of RCO-type but is not shown in this diagram because it is not expressed in
C. hirsuta (fig. S1) and has not been characterized functionally.




evolution of regulatory variants that underlie 
trait diversification between species, as we have 
shown here for RCO. 
A Viral RNA Structural Element Alters
Host Recognition of Nonself RNA 


Jennifer L. Hyde, Christina L. Gardner, Taishi Kimura, James P. White, Gai Liu,
Derek W. Trobaugh, Cheng Huang, Marco Tonelli, Slobodan Paessler, Kiyoshi Takeda,
William B. Klimstra, Gaya K. Amarasinghe, Michael S. Diamond*


Although interferon (IFN) signaling induces genes that limit viral infection, many pathogenic
viruses overcome this host response. As an example, 2'-O methylation of the 5’ cap of viral
RNA subverts mammalian antiviral responses by evading restriction by Ifit1, an IFN-stimulated
gene that regulates protein synthesis. However, alphaviruses replicate efficiently in cells expressing Ifit1
even though their genomic RNA has a 5’ cap lacking 2'-O methylation. We show that pathogenic
alphaviruses use secondary structural motifs within the 5’ untranslated region (UTR) of their RNA to
alter Ifit1 binding and function. Mutations within the 5’-UTR affecting RNA structural elements
enabled restriction by or antagonism of Ifit1 in vitro and in vivo. These results identify an evasion
mechanism by which viruses use RNA structural motifs to avoid immune restriction.


Eukaryotic mRNA contains a 5’ cap structure
with a methyl group at the N-7 position
(cap 0). In higher eukaryotes, methylation 
also occurs at the 2'-O position of the penultimate 
and antepenultimate nucleotides to generate cap 
1 and 2 structures, respectively. Many viral mRNAs 
also display cap 1 structures. Because cytoplasmic 
viruses cannot use host nuclear capping machinery, 
some have evolved viral methyltransferases for N-7
and 2'-O capping or mechanisms to “steal” the cap
from host mRNA (1). Whereas N-7 methylation of
mRNA is critical for efficient translation (2), cytoplasmic
viruses encoding mutations in their viral
2’-O-methyltransferases are inhibited by IFIT
proteins (3-7), which belong to a family of interferon
(IFN)-stimulated genes (ISGs) induced
after viral infection [reviewed in (8)]. Thus, 2’-O
methylation of host mRNA probably evolved, in
part, to distinguish self from nonself RNA (9, 10).

Alphaviruses are positive-strand RNA viruses 
that replicate in the cytoplasm and lack 2’-O 
methylation on the 5’ end of their genomic RNA 
(11, 12) and thus should be restricted by IFIT 
proteins. To assess the role of IFIT1 in limiting 
alphavirus replication, we silenced its expression 
in human HeLa cells and then infected them with 
Venezuelan equine encephalitis virus (VEEV) strain 
TC83, an attenuated New World alphavirus. In 
cells with reduced IFIT1 expression, TC83 replicated
to higher levels (Fig. 1A). To determine
whether this phenotype occurred in vivo, wild-type
(WT) and Ifit1−/− C57BL/6 mice were infected with
TC83. In contrast to WT mice, Ifit1−/− mice succumbed
to TC83 infection (Fig. 1B) and sustained
a higher viral burden (Fig. 1, C and D, and fig. S1), 
especially in the brain and spinal cord. 

We next analyzed the growth of TC83 in 
mouse embryonic fibroblasts (MEFs). Although
untreated WT and Ifit1−/− MEFs supported TC83
infection equivalently (Fig. 1E), IFN-β pretreatment
preferentially inhibited replication in WT cells.
However, an absence of Ifit1 was sufficient to
restore infection. A similar trend was observed with
Ifit1−/− dendritic cells and cortical neurons (fig. S2,
A and B). TC83 infection in Ifit1−/− MEFs remained
partially inhibited by IFN-β treatment,
indicating that additional ISGs restrict viral replication
(13-15). The similarity of infection by
TC83 in untreated WT and Ifit1−/− MEFs probably
reflects the ability of alphaviruses to antagonize
the induction of type I IFN and ISGs (16, 17).

TC83 was generated after passage of the vir- 
ulent Trinidad donkey (TRD) VEEV strain and 
contains two changes that attenuate virulence 
(18). One mutation occurs at nucleotide 3 (nt 3,
G3A) in the 5'-UTR and increases the sensitivity
of TC83 to type I IFN (17). We hypothesized that
the 5’-UTR mutation might explain the differential
sensitivity to Ifit1 and the pathogenicity of
TC83 and TRD. To begin to test this hypothesis,
WT and Ifit1−/− mice were infected with TRD
(Fig. 1F). WT and Ifit1−/− mice succumbed to TRD
infection without differences in survival time or
mortality. Thus, in contrast to TC83, TRD was
relatively resistant to the antiviral effects of Ifit1.

To determine whether the effect of the G3A 
mutation was independent of the TC83 structural 
genes, which contain a second attenuating mutation
(19), we assessed replication in WT and Ifit1−/−
MEFs of two isogenic chimeric VEEV/Sindbis
(SINV) viruses (20); these encode the 5’-UTR
and nonstructural proteins of TRD and structural
proteins of SINV, and differ only at nt 3
[(G3)VEE/SINV and (A3)VEE/SINV] (fig. S3, A and
B). In IFN-β-pretreated WT MEFs, (A3)VEE/SINV
was not recovered from culture supernatants
(Fig. 2A). However, in IFN-β-treated Ifit1−/−
MEFs, (A3)VEE/SINV infection was partially restored.
In contrast, (G3)VEE/SINV replicated equivalently
in IFN-β-treated WT and Ifit1−/− MEFs (Fig.
2B), indicating that a G at nt 3 renders VEEV resistant
to inhibition by Ifit1.

RNA secondary structure algorithms predicted 
differences in base pairing at the 5’ end of the UTR 
of G3 and A3 RNA [fig. S3A and (20, 21)]. The 
imino region of a two-dimensional nuclear Over- 
hauser effect spectroscopy nuclear magnetic reso- 
nance spectrum revealed that A3 RNA displayed 
less secondary structure and base pairing than G3 
RNA (fig. S4, A and B) and fewer cross peaks in 
the corresponding 1H/15N heteronuclear single- 
quantum coherence (HSQC) spectrum (fig. S4, C 
and D). On the basis of these data, we hypoth- 
esized that the stable stem-loop structure in the 
5'-UTR of TRD compensated for the absence of 
2'-O methylation of alphavirus RNA. To deter- 
mine whether the secondary structure or prima- 
ry sequence modulated Ifit1 susceptibility, we 
analyzed the growth of VEE/SINV containing the 
A3 nt mutation that also had compensatory muta- 
tions that were predicted to restore the 5'-UTR 
stem-loop (Fig. 2, C and D, and fig. S3C). Al- 
though two of the mutants tested (A3U24 and 
A3U24;A20U) showed increased [relative to 
(A3)VEE/SINV] but limited growth in IFN-β- 
treated WT MEFs, a third mutant (A3U24;20_21insC) 
replicated to levels comparable to (G3)VEE/SINV 
in IFN-β-treated WT and Ifit1−/− MEFs. Mutants 
that replicated less well in IFN-β-treated WT 
MEFs (A3U24 and A3U24;A20U) were predicted 
to have less stable minimum free energy struc- 
tures relative to (A3U24;20_21insC)VEE/SINV 
and (G3)VEE/SINV. To further define the require- 
ments in the 5'-UTR for evasion of Ifit1 restric- 
tion, we evaluated additional viral mutants: one 
that changed the sequence of the A3U24 loop 
but retained the less stable stem structure of the 
parent A3U24 5'-UTR [(LOOP)VEE/SINV] (22), 
and two G3 variants with more stable hairpins 
(G3;C19C20)VEE/SINV that contained additional 
nucleotide repeats (AUG and AUG2) appended 
to the 5' end (fig. S5A). The latter (AUGn)VEE/ 
SINV mutants were relevant because RNA rec- 
ognition by IFIT proteins reportedly requires a 5' 
overhang of 3 to 5 nts (22). Alteration of the loop 
sequence [(LOOP)VEE/SINV] did not relieve Ifit1- 
mediated restriction (fig. S5B). However, G3 mu- 
tants with an overhang of 3 or more nts at the 5' end 
became sensitive to Ifit1-dependent antiviral ef- 
fects (fig. S5C). 

To assess whether nucleotide changes altered 
the stability of the VEEV 5'-UTR, we monitored 
RNA unfolding by circular dichroism spectrometry 
(fig. S6). Changes in ellipticity as a function of 
temperature were analyzed (Fig. 2, E to I, and 
table S1); we observed several maxima, presumably 
corresponding to major cooperative unfolding events 
(Fig. 2, E to I). We detected more-pronounced 
maxima near 75°C in all but the A3 RNA, confirm- 
ing that A3 and G3 RNA have different stabilities. 
The A3U24;20_21insC mutant RNA displayed 
the most stable secondary structure. Computa- 
tional analyses suggested that even closely related 
RNA sequences (such as A3 and A3U24) have 

Fig. 1. VEEV TC83 but not TRD is restricted by Ifit1. (A) Flow cytometry 
contour plots showing infection of TC83 in IFN-β-treated HeLa cells transduced 
with short hairpin RNA (shRNA) against a scrambled nonsilencing control (NSC), 
human STAT2, or human IFIT1 (shNSC versus shIFIT1, P < 0.003). One rep- 
resentative experiment of four is shown. This phenotype was confirmed with a 
second shRNA against IFIT1. (B) Survival of 4-week-old WT mice (n = 10) and 
Ifit1−/− mice (n = 10) after subcutaneous (s.c.) infection with 10° focus-forming 
units (FFU) of TC83. Results are pooled from three independent experiments. 
P values for survival were calculated using the log-rank test. (C and D) Viral 
burden in 4-week-old WT or Ifit1−/− mice infected s.c. with 10° FFU of TC83, as 
measured in (C) a draining popliteal lymph node (DLN) and (D) the brain. 
Results are from five to nine mice per tissue. Asterisks indicate statistically 
significant differences, as judged by an unpaired t test (**P < 0.005, ***P < 
0.0001). Dashed lines indicate the limit of detection of the assay. (E) WT and 
Ifit1−/− MEFs were pretreated with 10 IU/ml of IFN-β for 12 hours or left 
untreated, and then infected with TC83 [with a multiplicity of infection (MOI) 
of 0.1]. Supernatants were harvested for virus titration (WT versus Ifit1−/−, P > 
0.2; WT + IFN-β versus Ifit1−/− + IFN-β, 12, 24, and 36 hours after infection, P < 
0.03). Each point represents the average of three experiments performed in trip- 
licate, and error bars represent the standard error of the mean (SEM). P values 
were determined by an unpaired t test. (F) Survival curves of 8-week-old WT mice 
(n = 10) and Ifit1−/− mice (n = 24) after s.c. infection with 50 plaque-forming units 
(PFU) of TRD. Results are pooled from two independent experiments. P values 
for survival were calculated using the log-rank test. 

different ensemble free energy and diversity 
(table S2). Differences in the base pairing prob- 
ability were noted, which further support struc- 
tural differences between A3 and G3 RNA (fig. 
S7). We also measured melting temperature 
(Tm) values (table S1), which showed an inverse 
correlation between Ifit1 susceptibility and base- 
pairing stability. These analyses suggest that 
G3 and A3U24;20_21insC 5'-UTR RNAs adopt 
more stable conformations, which correlates with 
antagonism of Ifit1. 
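
The thermal denaturation analysis described above reads unfolding transitions off the maxima of the derivative of ellipticity with respect to temperature. The following is a minimal sketch of that idea, not the authors' analysis pipeline: the function names, the central-difference scheme, and the synthetic two-state melting curve (midpoint set to 75°C purely for illustration) are all assumptions.

```python
import math

def melt_derivative(temps, signal):
    """Central-difference estimate of d(signal)/dT at interior temperature points."""
    out = []
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        out.append((temps[i], slope))
    return out

def local_maxima(deriv):
    """Temperatures at which the derivative exceeds both of its neighbours."""
    return [deriv[i][0]
            for i in range(1, len(deriv) - 1)
            if deriv[i][1] > deriv[i - 1][1] and deriv[i][1] > deriv[i + 1][1]]

def two_state_signal(temp, tm, width=3.0):
    """Hypothetical sigmoidal unfolding curve with midpoint tm (illustrative only)."""
    return 1.0 / (1.0 + math.exp(-(temp - tm) / width))

# Readings every 1 degree C from 5 to 95, mirroring the sampling in the Fig. 2 legend.
temps = list(range(5, 96))
signal = [two_state_signal(t, tm=75.0) for t in temps]
peaks = local_maxima(melt_derivative(temps, signal))
```

For a single cooperative transition the derivative has one maximum at the melting midpoint; real spectra with several unfolding events would yield several maxima and would normally need smoothing before peak picking.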

To validate that changes at nt 3 determined 
sensitivity to Ifit1 independently of other VEEV- 
encoded factors, we repeated experiments with 
isogenic variants of TC83 and an enzootic VEEV 
strain, ZPC-738 (Fig. 3, A to D, and fig. S3D). 
Whereas TC83 replicated poorly in IFN-β-treated 
WT MEFs, the isogenic nt 3 mutant TC83 A3G 
showed increased replication (Fig. 3A), confirming 
that the A3G mutation confers resistance to type I 
IFN. However, unlike that seen with (G3)VEE/SINV 
(Fig. 2B), the phenotype of TC83 A3G in IFN-β- 
treated WT MEFs did not fully recapitulate the 
restoration seen in IFN-β-treated Ifit1−/− MEFs 
(compare Fig. 3A to Fig. 3B), suggesting that 
additional viral elements may be inhibited by 
Ifit1. Infection of the mutant ZPC-738 G3A in 
IFN-β-treated WT MEFs was decreased as com- 
pared to WT ZPC-738, whereas the infection of 
WT and G3A ZPC-738 was equivalent in IFN-β- 
treated Ifit1−/− MEFs (Fig. 3, C and D). 


To assess whether nt 3 mutation reciprocally 
affects virulence, we infected WT and Ifit1−/− 
mice with TC83, ZPC-738, and paired isogenic 
variants (Fig. 3, E and F). In WT mice, ZPC-738 
G3A was attenuated as compared to the WT 
virus. However, no difference in mortality and 
only a small difference in survival kinetics were 
observed in Ifit1−/− mice infected with ZPC-738 
WT or G3A. In comparison, we observed in- 
creased lethality in WT mice infected with TC83 
A3G relative to TC83. We also noted a slight 
decrease in the survival kinetics of Ifit1−/− mice 
infected with A3G as compared to TC83 WT, 
suggesting that the A3G change may have ad- 
ditional effects aside from antagonizing Ifit1 
function. 

To determine whether structures in the 5'-UTR 
of other alphaviruses functioned analogously, we 
introduced mutations at either nt 5 or 8 into SINV 
(Fig. 3, G and H, and fig. S3E). These mutations 
were selected because they altered the virulence 
of SINV in rats (23, 24) and were predicted to 
change the 5'-UTR secondary structure (fig. S3E). 
An A-to-G substitution at nt 5 resulted in increased 
viral replication relative to that of the parental virus 
in IFN-β-pretreated WT MEFs but not in IFN-β- 
treated Ifit1−/− MEFs, suggesting that the A5G 
phenotype was specific to Ifit1. Conversely, a sub- 
stitution at nt 8 (G8U) resulted in a decrease in 
replication in IFN-β-treated WT MEFs relative to 
WT SINV, which was restored to comparable levels 
in IFN-β-treated Ifit1−/− MEFs. This experiment 
establishes that mutations within the 5'-UTR of an 
Old World alphavirus also affect Ifit1 antagonism, 
suggesting that secondary structure at the 5'-UTR 
might be a more universal mechanism to circum- 
vent Ifit1-mediated restriction. 

IFIT1 binds flavivirus RNA lacking 2'-O meth- 
ylation and blocks translation and binding of 
eukaryotic translation initiation factors (6, 7, 25). 
To determine whether Ifit1 differentially affected 
translation of alphavirus RNA with different 5'-UTR 
RNA structures, we transfected type 0 capped WT 
and G3A mutant translation reporter RNA encod- 
ing a luciferase gene fused to nsP1 (fig. S3F) (26) 
into IFN-β-treated or untreated MEFs (Fig. 4, A to 
D). In WT MEFs treated with IFN-β (Fig. 4A), G3 
RNA exhibited greater translation reporter activity 
relative to A3 RNA. We also detected greater trans- 
lation of G3 reporter RNA in untreated WT MEFs 
(Fig. 4B), suggesting that basal Ifit1 expression in 
these cells may limit A3 RNA translation. However, 
we observed a greater increase in A3 reporter RNA 
translation relative to G3 in Ifit1−/− MEFs that were 
treated with IFN-β or left untreated (Fig. 4, C and 
D). The higher level of A3 versus G3 RNA trans- 
lation in Ifit1−/− MEFs was not unexpected, because 
(A3)VEE/SINV replicates more efficiently than 
(G3)VEE/SINV in cells lacking type I IFN induction 
(20). Although A3 RNA has a translation advan- 
tage in cells defective in innate immune responses, 
the G3 nucleotide confers resistance to Ifit1. 

Fig. 2. Mutations in the 5'-UTR determine Ifit1 sensitivity in vitro. (A and 
B) Growth kinetics of (A3)VEE/SINV and (G3)VEE/SINV viruses in WT and Ifit1−/− 
MEFs. Cells were pretreated with 1 IU/ml of IFN-β for 12 hours or left untreated, 
and then infected with (A3)VEE/SINV or (G3)VEE/SINV (MOI of 0.1). Supernatants 
were harvested at indicated times for virus titration [(A3)VEE/SINV: WT + IFN-β 
versus Ifit1−/− + IFN-β, 36 and 48 hours after infection, P < 0.006]. Each point 
represents the average of three independent experiments performed in trip- 
licate, and error bars represent the SEM. P values were determined using an 
unpaired t test. Dashed lines indicate the limit of detection of the assay. 
(C and D) Growth kinetics of (G3)VEE/SINV, (A3U24)VEE/SINV, (A3U24;A20U) 
VEE/SINV, and (A3U24;20_21insC)VEE/SINV viruses in WT (C) and Ifit1−/− (D) 
MEFs. Experiments and analysis were performed as in (A). (E to I) Thermal 
denaturation of A3, G3, A3U24, A3U24;A20U, and A3U24;20_21insC RNA as 
measured by circular dichroism spectroscopy at 210 nm. RNA was heated from 
5° to 95°C at a rate of 1°C/min, and readings were collected every 1°C to 
monitor unfolding. Data are represented as the change in molar ellipticity as a 
function of temperature (dθ/dT), and red arrows indicate major maxima. One 
representative experiment of two is shown. 

Fig. 3. Mutations that alter the secondary structure of the 5'-UTR affect 
pathogenicity in vivo. (A to D) Growth kinetics of isogenic TC83 WT and A3G 
[(A) and (B)] or ZPC-738 WT and G3A [(C) and (D)] in WT and Ifit1−/− MEFs. 
Cells were pretreated with 10 IU/ml of IFN-β for 12 hours (TC83) or 100 IU/ml 
of IFN-β for 8 hours (ZPC-738), or left untreated, and then infected with the 
respective viruses (MOI of 0.1) [TC83 versus TC83(A3G): WT + IFN-β, 36 and 
48 hours after infection, P < 0.006; ZPC-738 versus ZPC-738(G3A): WT + IFN-β, 
24 hours after infection, P < 0.0001]. Each point represents the average of 
two (ZPC-738) or three (TC83) independent experiments performed in trip- 
licate, and error bars represent the SEM. P values were determined using the 
unpaired t test. Dashed lines indicate the limit of detection of the assay. (E 
and F) Survival studies of isogenic ZPC-738 WT and G3A (E) and TC83 WT 
and A3G (F) viruses in WT and Ifit1−/− mice. Mice were infected s.c. with 107 PFU 
of ZPC-738 (WT, n = 6; Ifit1−/−, n = 15) or ZPC-738(G3A) (WT, n = 8; Ifit1−/−, 
n = 15) and 10° PFU of TC83 (WT, n = 18; Ifit1−/−, n = 13) or TC83(A3G) (WT, 
n = 21; Ifit1−/−, n = 8). ZPC-738 versus ZPC-738(G3A): WT mice, survival P = 0.0002; 
mean time to death (MTD) of 5.5 versus 8.3 days, P = 0.0002. ZPC-738 versus 
ZPC-738(G3A): Ifit1−/− mice, MTD of 4.0 versus 5.8 days, P < 0.0001. TC83 versus 
TC83(A3G): WT mice, survival P < 0.0001; TC83 versus TC83(A3G): Ifit1−/− 
mice, MTD of 8.2 versus 6.3 days, P < 0.003. Experiments were performed 
twice for ZPC-738 viruses and four times for TC83 viruses. P values for survival 
were determined as in Fig. 1. P values for MTD were determined using an 
unpaired t test. (G and H) Growth kinetics of SINV Toto, A5G, and G8U SINV in 
WT (G) and Ifit1−/− (H) MEFs. Cells were pretreated with 1 IU/ml of IFN-β for 12 
hours or left untreated, and then infected with the respective viruses at an MOI 
of 0.1. SINV Toto versus A5G: WT MEFs + IFN-β, P < 0.05; SINV Toto versus 
G8U, WT MEFs + IFN-β, P < 0.05. Experiments and analysis were performed 
as in (A). 


We hypothesized that alphavirus mutants with 
different 5'-UTR structural stabilities might in- 
teract with Ifit1 in a manner that is less compatible 
with translation. We used electrophoretic mobility 
shift assays (EMSAs) (Fig. 4, E to G) to de- 
termine whether TRD 5'-UTR RNA containing 
an A3 or G3 and a type 0 cap differentially inter- 
acted with Ifit1 (Fig. 4E). We observed significant 
binding of Ifit1 to A3 RNA but less binding to 
G3 RNA, suggesting that the secondary struc- 
ture of the G3 RNA probably inhibited inter- 
action with Ifit1. This conclusion was supported 
by dot-blot binding studies, which showed a 2- 
to 10-fold greater affinity [dissociation constant 
(K_D) ~30 nM] of cap 0 A3 RNA as compared to 
G3 RNA for Ifit1, depending on the incubation 
conditions (Fig. 4H and fig. S8). The binding of 
Ifit1 to cap 0 RNA was specific, as it was com- 
peted by excess unlabeled 5'-ppp A3 RNA (fig. 
S87). Exogenous 2'-O methylation of A3 and G3 
RNA, which generates a type 1 cap, resulted in less 
Ifit1 binding (Fig. 4F), which agrees with flavi- 
virus studies (6, 7). When EMSA experiments 
were repeated in the absence of capping, TRD 
5'-UTR RNA containing an A3 or G3 and a free 
5'-ppp was differentially and weakly recognized by Ifit1 
(Fig. 4G), which is consistent with experiments 
demonstrating that single-stranded RNA, but not 
double-stranded RNA containing a free 5'-ppp, is 
bound by IFIT1 (22). Excess A3 5'-ppp RNA com- 
pared to G3 5'-ppp RNA preferentially competed 
for Ifit1 binding to type 0 cap A3 RNA [inhibition 
constant (K_i) of 3 and 48 μM for A3 and G3 5'-ppp 
RNA, respectively; fig. S9]. These results suggest 
that secondary structure in the context of an un- 
capped RNA can alter Ifit1 binding and may 
contribute to why negative-stranded viruses with 
5'-ppp genomic RNA and highly structured 
5'-UTRs (such as filoviruses) are resistant to type 
I IFN and Ifit1-mediated control. Our results also 
establish that Ifit1 has a higher affinity for RNA 
with a type 0 cap than with a free 5'-ppp moiety. 


Alphaviruses use a stable 5'-UTR stem-loop 
structure to antagonize Ifit1 antiviral activity. Al- 
though some IFIT proteins bind 5'-ppp RNA 
(22, 27), it remains to be determined how Ifit1 
differentially recognizes capped RNA that dis- 
plays or lacks 2'-O methylation and how alpha- 
virus 5'-UTR stem-loop structures affect this. 
Our experiments suggest that genomic RNA ele- 
ments can function to evade host cell-intrinsic 
immunity. Thus, structural elements in viral or 
virus-associated RNA can bind antiviral pro- 
teins irreversibly to block function (28, 29) or 
attenuate binding of host antiviral proteins. It is 
intriguing to consider that viral RNA structural 
elements that antagonize Ifit1 recognition may 
have become targets for other RNA sensors 
(such as RIG-I and MDA5). Finally, these results 
may be relevant to pharmaceutical approaches 
that use mRNA as therapeutics or vaccine design 
strategies for attenuating alphaviruses and other 
viruses. 

Fig. 4. The nt G3 in the 5'-UTR evades translational inhibition by 
altering Ifit1-RNA binding. (A to D) Luciferase assays of A3 and G3 TRD 
translation reporters. WT and Ifit1−/− MEFs were untreated or treated with 
100 IU/ml IFN-β for 8 hours, and then electroporated with in vitro syn- 
thesized and type 0-capped reporter RNA. Cell lysates were harvested at the 
indicated time points and assayed for luciferase activity. Each bar represents 
the average of four independent experiments performed in triplicate. WT 
MEFs + IFN-β: G3 versus A3, P < 0.0004; WT MEFs, no treatment: G3 versus 
A3, P < 0.005; Ifit1−/− MEFs + IFN-β, G3 versus A3, P < 0.05 (30, 60, and 
120 min). Error bars represent the SEM. P values were determined using an 
unpaired t test. (E to G) EMSA of A3 and G3 VEEV 5'-UTR RNA bound to 
recombinant Ifit1. G3 and A3 VEEV 5'-UTR RNA were synthesized in vitro 
using T7 polymerase (5'-ppp) and then treated with (E) an N-7 methyl- 
guanosine capping reagent (cap 0), (F) an N-7 methylguanosine capping 
reagent and an exogenous 2'-O methyltransferase (cap 1), or (G) no enzymes 
(5'-ppp). All RNA was labeled with biotin and competed with 3 μg of homol- 
ogous unlabeled RNA. Cap 0 and cap 1 RNAs were heated at 95°C; 5'-ppp 
RNA was heated at 70°C, as no specific binding was observed after heating at 
95°C. Binding assays were performed with 1 μg of Ifit1. EMSA data are 
representative of at least three independent experiments. Arrows indicate 
specific binding of RNA to Ifit1, whereas asterisks indicate nonspecific binding 
(not competed with unlabeled RNA). G3 and A3 5'-ppp paired samples were 
run simultaneously on the same gel and cropped as individual panels for 
presentation purposes. (H) Quantification of Ifit1-A3/G3 RNA binding by filter- 
binding assay at 4°C. The fraction bound of A3 cap 0 (black squares) and G3 
cap 0 (red squares) was normalized to maximum binding and plotted against 
Ifit1 concentration. Data from A3 (black) and G3 (red) were fitted using the 
Hill equation. A3 cap 0 K_D = 0.030 ± 0.004 μM; G3 cap 0 K_D = 0.091 ± 
0.007 μM. One representative experiment of three performed in triplicate is 
shown. 
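
To make the Hill-equation fit in (H) concrete, the sketch below evaluates the normalized fraction bound at a given Ifit1 concentration for the two reported dissociation constants. It is an illustration under stated assumptions rather than the authors' fitting code: the function name is hypothetical, and the Hill coefficient is set to 1 because the legend does not report the fitted value.

```python
def hill_fraction_bound(protein_um, kd_um, hill_n=1.0):
    """Normalized fraction of RNA bound at a protein concentration (uM).

    Uses the Hill form  P^n / (K_D^n + P^n).  hill_n = 1 is an assumption;
    the fitted Hill coefficient is not given in the figure legend.
    """
    p_n = protein_um ** hill_n
    return p_n / (kd_um ** hill_n + p_n)

# Dissociation constants reported in the Fig. 4H legend (uM).
KD_A3_CAP0 = 0.030
KD_G3_CAP0 = 0.091

# Ratio of the two constants: how much weaker G3 binding is under these conditions.
fold_difference = KD_G3_CAP0 / KD_A3_CAP0

# Fraction bound across the concentration range plotted in panel (H).
curve = {conc: (hill_fraction_bound(conc, KD_A3_CAP0),
                hill_fraction_bound(conc, KD_G3_CAP0))
         for conc in (0.01, 0.1, 1.0)}
```

By construction, the fraction bound is one-half when the Ifit1 concentration equals K_D, so the lower K_D of A3 cap 0 RNA shifts its curve to the left of the G3 curve; the ratio of the two constants is roughly threefold, within the 2- to 10-fold range quoted in the text.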

A Common Cellular Basis for 
Muscle Regeneration in Arthropods 
and Vertebrates 

Nikolaos Konstantinides and Michalis Averof 


Many animals are able to regenerate amputated or damaged body parts, but it is unclear whether 
different taxa rely on similar strategies. Planarians and vertebrates use different strategies, 
based on pluripotent versus committed progenitor cells, respectively, to replace missing tissues. 
In most animals, however, we lack the experimental tools needed to determine the origin of 
regenerated tissues. Here, we present a genetically tractable model for limb regeneration, the 
crustacean Parhyale hawaiensis. We demonstrate that regeneration in Parhyale involves 
lineage-committed progenitors, as in vertebrates. We discover Pax3/7-expressing muscle satellite 
cells, previously identified only in chordates, and show that these cells are a source of regenerating 
muscle in Parhyale. These similarities point to a common cellular basis of regeneration, dating 
back to the common ancestors of bilaterians. 


Regeneration relies on specific popula- 
tions of progenitor cells, which serve as 
the source of new cells in the regenerated 
tissues. Progenitors may be undifferentiated stem 
cells or differentiated cells that have the capac- 
ity to dedifferentiate, proliferate, and rediffer- 
entiate to produce new functionally specialized 
cells (1). Their identity and degree of commit- 
ment are relevant for addressing fundamental 
questions in regenerative biology, such as the 
role of cellular memory and plasticity during 
regeneration. 

Although the capacity to regenerate is wide- 
spread in animals, the evolutionary origins of re- 
generation remain unexplored. Among animals 
with extensive regenerative abilities, progenitor 
cells have been identified only in planarians, ver- 
tebrates, and cnidarians. These animals use dif- 
ferent strategies to replace missing tissues. In 
planarians, all tissues regenerate from a common 
pool of pluripotent progenitor cells (2, 3), whereas 
in vertebrates, different cell types arise from dis- 
tinct progenitors (4-7). Cnidarians use progenitor 
cells that are specialized to different degrees in 
different species (8–10). In most animal phyla, 
we lack the tools to rigorously identify these pro- 
genitor cells and to map their lineage commit- 
ments. Thus, we do not know if similar types of 
progenitors exist across diverse phyla, whether 
there are shared regenerative strategies, and which 
of these strategies is most ancient. 

Here, we establish the crustacean Parhyale 
hawaiensis as a genetically tractable model for 
limb regeneration. Parhyale can fully regener- 
ate all amputated appendages throughout their 
lifetime (Fig. 1A). Regeneration restores all of 
the cell types that can be observed in adult 
limbs, including epidermis, neurons, and mus- 
cles. Several of these cell types can be visual- 
ized with the use of gene trap lines and reporter 
constructs (fig. S1). The speed of regeneration 
varies among individuals and correlates with 
the frequency of moulting (fig. S2). Young adults 
typically need 5 to 8 days to regenerate a pat- 
terned thoracic limb, such as the one shown in 
Fig. 1B. 

We used cellular, morphological, and ge- 
netic markers to define a timeline for limb re- 
generation in Parhyale young adults (Fig. 1C). 
Wound closure takes place within a day of am- 
putation, as seen by the development of a mel- 
anized scab at the wound surface (Fig. 1, D 
to D”). This is followed by the formation of a 
blastema, consisting of proliferating cells un- 
derlying the wound, which can be visualized 
by EdU incorporation 2 to 3 days after ampu- 
tation (Fig. 1, E to E”, and fig. S3). Approx- 
imately 4 to 6 days after amputation, the distal 
tip of the newly regenerated appendage be- 
comes apparent, visualized with the Distal-DsRed 
gene trap (11) (Fig. 1, F to F″, and movie S1). 
During the following days, the regenerating 
limb grows in size, acquiring its characteristic 
pattern of limb segments. The axis of the limb 
is folded (often S-shaped) to accommodate the 
growing appendage in the limited space avail- 
able within the exoskeleton of the amputated 
limb (Fig. 1B). The fully regenerated limb is re- 
vealed during the next moult. Muscles, visualized 
with the PhMS-DsRed reporter (12), regenerate 
after the epidermis, within a week from moulting 
(fig. S1B’). 

To address whether pluripotent or lineally re- 
stricted progenitor cells give rise to the new tis- 
sues of regenerated limbs, we marked individual 
cell lineages in early embryos and followed their 
regenerative contributions in adults. Parhyale 
embryos have a stereotypic early cell lineage: 
At the eight-cell stage, three blastomeres (El, Er, 
and Ep) are fated to produce the ectoderm, three 
(ml, mr, and Mav) give rise to mesoderm, one 
(en) produces cells that localize in the gut, and 
one (g) gives rise to the germ line (13) (Fig. 2A). 
We stably marked each of these blastomere lin- 
eages by injecting early embryos with a Minos 
transposon carrying a fluorescence marker [en- 
hanced green fluorescent protein (EGFP) or 
DsRed] driven by the PhHS promoter, which is 
activated in all cell types after heat shock (14). 
Integration of the transposon in the genome of 
individual blastomeres produced mosaic ani- 
mals in which the marked lineages could be 
identified (Fig. 2, A to C). By injecting ~4000 
early embryos and screening the survivors as 
late embryos and juveniles, we identified 79 in- 
dividuals in which specific cell lineages were 
marked. These individuals were raised to adult- 
hood, subjected to limb amputations, and al- 
lowed to regenerate. The contribution of each 
marked lineage was then assessed on the regen- 
erated limbs (Fig. 2, B and C, and supplementary 
materials and methods). Results are summarized 
in Fig. 2D. 

We consistently observed that the descend- 
ants of blastomeres El, Er, and Ep gave rise to 
regenerated ectodermal cell types, namely epi- 
dermis and neurons, but never to mesodermal 
cells, such as muscle (Fig. 2B). Conversely, the 
descendants of blastomeres ml, mr, and Mav 
gave rise to regenerated muscle, but not to epi- 
dermis or neurons (Fig. 2C). We did not observe 
any contribution from the en and g lineages to 
regenerated appendages. We found no blas- 
tomere lineages contributing to both ectoder- 
mal and mesodermal cell types, implying an 
absence of pluripotent progenitors and of trans- 
differentiation across ectoderm and mesoderm. 
These results demonstrate that, in Parhyale, re- 
generative progenitor cells have a developmental 
potential that is restricted with respect to germ 
layers. These progenitors may be stem cells or 
differentiated cells that have retained the capacity 
to proliferate. 
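
The logic of the lineage-tracing result can be written down as a small lookup, shown here as an illustrative sketch. The blastomere fates and regenerated tissues are taken from the text above; the dictionary and function names are hypothetical, and the check simply asks whether any marked lineage contributed to tissues of more than one germ layer.

```python
# Embryonic fate of each eight-cell-stage blastomere, as stated in the text.
BLASTOMERE_FATE = {
    "El": "ectoderm", "Er": "ectoderm", "Ep": "ectoderm",
    "ml": "mesoderm", "mr": "mesoderm", "Mav": "mesoderm",
    "en": "gut", "g": "germ line",
}

# Tissues each marked lineage was observed to regenerate in amputated limbs.
REGENERATED_TISSUES = {
    "El": {"epidermis", "neurons"},
    "Er": {"epidermis", "neurons"},
    "Ep": {"epidermis", "neurons"},
    "ml": {"muscle"},
    "mr": {"muscle"},
    "Mav": {"muscle"},
    "en": set(),   # no contribution to regenerated appendages
    "g": set(),    # no contribution to regenerated appendages
}

# Germ layer of each regenerated tissue type.
TISSUE_LAYER = {"epidermis": "ectoderm", "neurons": "ectoderm", "muscle": "mesoderm"}

def germ_layers_regenerated(lineage):
    """Set of germ layers represented among the tissues a lineage regenerated."""
    return {TISSUE_LAYER[tissue] for tissue in REGENERATED_TISSUES[lineage]}

# A pluripotent progenitor pool would show up as a lineage spanning both layers.
cross_layer_lineages = [name for name in REGENERATED_TISSUES
                        if len(germ_layers_regenerated(name)) > 1]

# Every contribution stays within the lineage's own embryonic germ layer.
restricted = all(germ_layers_regenerated(name) <= {BLASTOMERE_FATE[name]}
                 for name in REGENERATED_TISSUES)
```

An empty `cross_layer_lineages` list corresponds to the observation that no blastomere lineage contributed to both ectodermal and mesodermal cell types.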

We also found that marked cell lineages only 
contribute to regenerating limbs locally, with- 
in the body region that was originally popu- 
lated by each lineage: El and ml contribute to 
regeneration of appendages on the left side of 
the body, Er and mr to appendages in the right, 
Ep to posterior thoracic appendages, and Mav 
to regenerated antennae (Fig. 2D). The fact that 
no lineage contributed to all limbs suggests that 
there is no central pool of progenitor cells for the 
whole body—the progenitor cells reside locally. 

Our experiments show that Parhyale limb 
regeneration involves at least two distinct types 
of progenitor cells: ectodermal and mesoder- 
mal. These progenitors derive from distinct 
cell lineages, have predetermined (lineally re- 
stricted) regenerative capacities, and are present 
near the regenerating tissue. This is highly remi- 
niscent of vertebrates, in which distinct pro- 
genitors give rise to different ectodermal and 
mesodermal cells during limb, fin, and tail re- 
generation (4–6), and differs from the strategy 
used by planarians involving pluripotent stem 
cells (2, 3). 

While studying the PhMS-DsRed reporter 
line, where DsRed is expressed in muscles (12), 
we noticed another DsRed-expressing cell type 
that is tightly associated with muscles within 
each limb (Fig. 3A). These compact cells re- 
minded us of the satellite cells, which can serve 
as muscle progenitors during muscle mainte- 
nance, growth, and regeneration in vertebrates 
(4, 15-19). Satellite cells are characterized by 
expression of Pax3/7 family transcription fac- 
tors (19, 20). Using two antibodies that recog- 
nize Pax3/7 proteins in a wide range of animals 
(21) and in situ hybridization, we saw that the 
muscle-associated cells of Parhyale also ex- 
press Pax3/7 (Fig. 3, A to C, and fig. S6). The 
expression level is variable but is consistently 
above background in these cells (fig. S7). By 
genetically marking mesodermal cell lineages 
(as described earlier), we could demonstrate that 
these Pax3/7-positive cells have a mesodermal 
origin (Fig. 3D). Based on these characteristics, 
we refer to these mesodermal Pax3/7-positive 
cells as satellite-like cells (SLCs). 

Consistent with a possible involvement of 
SLCs in regeneration, Pax3/7-positive cells pro- 
liferate in the amputated limb stump and con- 
tribute to the regeneration blastema 2 to 3 days 
after amputation (Fig. 3E). To directly test wheth- 
er SLCs can act as muscle precursors during 
regeneration, we transplanted isolated SLCs 
from the limbs of PhMS-EGFP transgenic ani- 
mals into the amputated limbs of wild-type 
animals and examined the contribution of those 
EGFP-marked cells after regeneration (Fig. 4A 


Fig. 1. Limb regeneration in Parhyale. (A) Adult 
Parhyale with amputated antenna, thoracic leg, and 
uropods (arrowheads). Scale bar, 1 mm. (B) Regenerat- 
ing thoracic leg visualized by the Distal-DsRed gene trap 
(11) (red), within the cuticle of the amputated limb (green 
autofluorescence). Scale bar, 100 μm. (C) Timeline of 
Parhyale leg regeneration, representing the average 
speed of regeneration in ~6-month-old adults. (D to F″) 
Different phases of leg regeneration are characterized 
by melanized scab formation at the wound surface 
(arrowhead) (D, D′, D″); proliferating cells in the blastema, 
labeled by EdU (green) (E, E′, E″); and morphogenesis of 
the leg visualized by the Distal-DsRed gene trap (F, F′, F″) 
(time-lapse shown in movie S1). hpa, hours postamputation. 

and supplementary materials). The EGFP-marked 
cells expressed Pax3/7 (fig. S7) and were free 
of muscle cells. Out of 72 limbs that received 
marked SLCs, 12 contained one or a few EGFP- 
expressing muscle fibers after regeneration 
(Fig. 4, B to D). To test whether muscle fibers 
could derive from unlabeled cells that were in- 
advertently transplanted together with the EGFP- 
marked SLCs, we also transplanted unlabeled 
(non-EGFP-expressing) cells from PhMS-EGFP 
donors to wild-type recipients; no EGFP-expressing 
muscles were observed after regeneration in 34 
limbs. These results demonstrate that SLCs are 
capable of functioning as progenitor cells for 
regenerated muscle. Our experiments do not ex- 
clude the involvement of other types of muscle 
progenitors, in addition to SLCs. 
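
As an illustration of how the transplantation counts above (12 of 72 limbs with marked SLCs versus 0 of 34 control limbs) could be compared, here is a minimal one-sided Fisher exact test written with the standard library. This comparison is not reported in the text and is offered only as a sketch; the function name is hypothetical, and the test treats limbs as independent, which may not hold if several limbs came from the same animal.

```python
from math import comb

def fisher_one_sided(hits_a, n_a, hits_b, n_b):
    """One-sided Fisher exact P value.

    Probability, with all margins fixed, that group A contains at least
    hits_a of the (hits_a + hits_b) positive outcomes.
    """
    total_hits = hits_a + hits_b
    denominator = comb(n_a + n_b, total_hits)
    numerator = 0
    for x in range(hits_a, min(total_hits, n_a) + 1):
        # comb returns 0 when the second argument exceeds the first,
        # so impossible tables contribute nothing.
        numerator += comb(n_a, x) * comb(n_b, total_hits - x)
    return numerator / denominator

# Counts taken from the text: limbs with EGFP-expressing muscle fibers.
p_value = fisher_one_sided(hits_a=12, n_a=72, hits_b=0, n_b=34)
fraction_positive = 12 / 72
```

Because the control group has no positive limbs, the sum reduces to a single table, the one in which all 12 positive limbs fall in the SLC-transplanted group.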

Our study has revealed a number of key 
similarities between arthropod and vertebrate 


[Fig. 1C timeline labels: wound closure; blastema formation; patterning; epidermis, neurons; growth; moult; muscle formation] 

Fig. 2. Contribution of marked cell lineages to regen- 
erated tissues. (A) At the eight-cell stage, the Parhyale 
embryo has blastomeres contributing specifically to the 
ectoderm (blue), mesoderm (red), gut (gray), and germ 
line (yellow) (13). Images show EGFP-marked lineages 
in the ectoderm (El, ventral view), mesoderm (mr, lateral 
view), germ line (g, dorsal view), and gut (en, dorsal view). 
(B and C) EGFP-marked mosaics showing the contributions 
of the El and mr lineages during regeneration. After ampu- 
tation, the El lineage contributed to epidermis and neu- 
rons (B), whereas the mr lineage contributed to muscles 
(C). Amputation planes are marked by dashed lines; limbs 
are visualized by reflected light in (C) (magenta). (D) 
Summary of lineage contributions to regenerated limb 
tissues (number of marked limbs/number of limbs tested). 
n, number of animals in which the particular blastomere 
lineage was labeled. 




Fig. 3. Satellite-like cells (SLCs) in Parhyale. (A) SLCs (arrow- 
heads) in the ischium and merus of a thoracic limb, from a late embryo 
carrying the PhMS-DsRed reporter, stained with antibodies for DsRed 
(red) and Pax3/7 (blue) and with phalloidin to mark muscles (green). 
The PhMS regulatory sequence carries putative MyoD binding sites 
(12). (B) SLC in Parhyale limb stained with an antibody for Pax3/7 
(black) and phalloidin (red). SLC nuclei have a diameter of 5 to 10 μm 
and occupy most of the cell’s volume. (C) SLC in an adult Parhyale limb 
sectioned transversely and stained with an antibody for Pax3/7 (red), 
4',6-diamidino-2-phenylindole (DAPI) (blue), and phalloidin (green). 
SLC nuclei are associated with muscle fibers in late embryos, juveniles, 
and adults. (D) SLCs have a mesodermal origin, seen by the colocaliza- 
tion of Pax3/7-positive nuclei (blue) with a marker for mesoderm (arrow- 
head) (seven out of seven SLCs scored in late embryos). Mesodermal cells 
were clonally marked with nuclear-localized EGFP (green) and membrane- 
localized tdTomato (red). (E) Pax3/7-expressing cells (red) contribute to the 
regeneration blastema (EdU, green). The amputation plane was at the 
edge of the blastema, on the right. Four Pax3/7-positive cells are seen in 
the proximal part of the blastema, three of which are positive for EdU (on 
average, 2.5 Pax3/7-positive cells per blastema, scored on 12 blastemas). 
Individual fluorescence channels for these images are shown in fig. S4.

limb regeneration. First, regenerated ectoder-
mal and mesodermal cells derive from distinct
progenitor cells, rather than a common pool of
pluripotent progenitors. Second, the progeni-
tors are present locally in the amputated limb
stump. Third, muscle regeneration involves a
similar progenitor cell type, the satellite-like
cells, which are tightly associated with mus-
cles before regeneration and contribute to new-
ly formed muscle fibers. We suggest that these
similarities reflect cell types and repair strat-
egies that evolved in the common ancestors of
protostomes and deuterostomes in Precambri-
an times.


References and Notes 

1. E. M. Tanaka, P. W. Reddien, Dev. Cell 21, 172-185 (2011). 

2. P. W. Reddien, A. Sanchez Alvarado, Annu. Rev. Cell Dev. 
Biol. 20, 725-757 (2004). 

3. D. E. Wagner, I. E. Wang, P. W. Reddien, Science 332, 
811-816 (2011). 


[Fig. 2D: table of lineage contributions to regenerated limb tissues (number of marked limbs/number of limbs tested); column labels include Posterior, Left, Right, and Anterior.]


4. C. Gargioli, J. M. Slack, Development 131, 2669-2679 (2004).

5. M. Kragl et al., Nature 460, 60-65 (2009).

6. S. Tu, S. L. Johnson, Dev. Cell 20, 725-732 (2011).

7. Y. Rinkevich, P. Lindau, H. Ueno, M. T. Longaker,
I. L. Weissman, Nature 476, 409-413 (2011).

8. H. R. Bode, J. Cell Sci. 109, 1155-1164 (1996).

9. W. A. Müller, R. Teo, U. Frank, Dev. Biol. 275, 215-224 (2004).

10. T. C. G. Bosch, Bioessays 31, 478-486 (2009).

11. Z. Kontarakis et al., Development 138, 2625-2630 (2011).


14 FEBRUARY 2014 VOL 343 SCIENCE www.sciencemag.org 


Fig. 4. Transplanted
SLCs contribute to mus- 
cle regeneration. (A) 
Schematic representation 
of the transplantation ex- 
periment designed to test 
the contribution of SLCs to 
regenerated muscle. EGFP- 
marked SLCs were taken 
from the limbs of trans- 
genic donors carrying the 
PhMS-EGFP construct (12) 
and transplanted into 
freshly amputated limbs 
of nontransgenic recipi- 
ents. In control experiments, 
non-EGFP-expressing cells
were transplanted using 
the same donor and recip- 
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cence (arrowhead, green) 
in recipient limbs also 
stained with phalloidin 
(red) and DAPI (blue). 
ex, autofluorescence of 
the chitinous exoskel- 
eton. Scale bar, 50 µm.
(C) Higher-magnification view of EGFP-expressing muscle. Scale bar, 10 µm.
(D) Transverse section of a regenerated recipient limb, showing an EGFP- 
expressing muscle fiber (arrowhead), stained with an antibody for EGFP

Somites Without a Clock

Ana S. Dias, Irene de Almeida, Julio M. Belmonte, James A. Glazier, Claudio D. Stern


The formation of body segments (somites) in vertebrate embryos is accompanied by molecular
oscillations (segmentation clock). Interaction of this oscillator with a wave traveling along
the body axis (the clock-and-wavefront model) is generally believed to control somite number, size,
and axial identity. Here we show that a clock-and-wavefront mechanism is unnecessary for somite
formation. Non-somite mesoderm treated with Noggin generates many somites that form
simultaneously, without cyclic expression of Notch-pathway genes, yet have normal size, shape,
and fate. These somites have axial identity: The Hox code is fixed independently of somite fate.
However, these somites are not subdivided into rostral and caudal halves, which is necessary for
neural segmentation. We propose that somites are self-organizing structures whose size and shape
are controlled by local cell-cell interactions.


The mesoderm of the embryo, from which the cardiovascular and
musculoskeletal systems arise, derives from the primitive
streak (PS) during gastrulation. A high level of
bone morphogenetic protein (BMP) at the pos-
terior PS generates ventral mesoderm (blood
vessels, lateral and extraembryonic mesoderm),
whereas lower levels near the anterior tip gen-
erate paraxial mesoderm, from which somites
(future striated muscle and axial skeleton) de-
velop (1). Somites are epithelial spheres that
form sequentially from head to tail on either
side of the spinal cord. The combination of a mo-
lecular clock (cell-autonomous Notch and Wnt
oscillations) and a wave traveling the length of
the paraxial mesoderm (2, 3) is thought to reg-
ulate the number, size, timing of formation, and

Fig. 1. BMP inhibition generates normal somites. (A to E) Exper-
imental design. The PS of a donor quail or GFP-transgenic embryo
is excised; exposed to Noggin; and grafted, surrounded by Noggin
beads, to the periphery of a host chick embryo [(A and B), arrows].
After overnight incubation, a group of somite-like structures, arranged
as a bunch of grapes, appears [(C and D), arrows]. These structures
fluoresce if the donor is a GFP-transgenic embryo (E). (F to P) The
ectopic structures are real somites: They express paraxis (F and G) and
N-cadherin [green in (H) to (J)] and are surrounded by a Fibronectin
matrix [red in (H) to (J)]. Multiphoton confocal sections through nor-
mal (I) and ectopic (J) somites were used to estimate somite sizes (K).
When an ectopic somite is grafted instead of a somite in an older embryo (L), the graft incorporates well (M). After 2 to 3 days, the grafted somite
appropriately expresses MyoD (N to P).


Fig. 2. Ectopic somites form without cyclic ex- 
pression of segmentation clock genes. Embryos 
were fixed at 45-min intervals (examples shown 
at 3, 5.15, 6.45, and 7.5 hours after grafting to a 
host embryo) and stained for expression of Hairy1 
(A to D), Hairy2 (E to H), and LFng (I to L). The in 
situ embryos were developed to reveal the seg- 
mentation clock in the presomitic cells of the host. 
Although patterns of expression in the presomitic meso- 
derm of the host are dynamic, no major differences 
in expression are seen in the graft (insets). Arrows 
mark the graft region, which is shown magnified in 
the insets.

axial identity (4-6) of somites. Because the BMP
antagonist Noggin is sufficient to transform ven- 
tral cells to a dorsal (somite) fate (7, 8), we applied 
Noggin as evenly as possible (9) to dorsalize pos- 
terior PS explants from quail or green fluores- 
cent protein (GFP)-transgenic chick embryos and
thus to test whether somites could be generated
independently of a segmentation clock (10, 11).
Explants from stage-5 (12) embryos were in-
cubated in Noggin for 3 hours, then grafted into a 
remote (extraembryonic) region of a host chick 
embryo surrounded by Noggin-soaked beads 
(Fig. 1, A and B). A few hours later (total 9 to 12 
hours), 6 to 14 somite-like structures had formed, 
arranged as a “bunch of grapes” (Fig. 1, C to E) 
rather than in linear sequence. Like normal som- 
ites, these structures express paraxis (8) (Fig. 1, 
F and G) and consist of epithelial cells around a 
lumen (Fig. 1, G to J), with apical N-cadherin 
and a Fibronectin-positive basal lamina (Fig. 1, 
H to J). The size of each somite-like structure is 
normal: Fig. 1K compares ectopic and normal 
somite volumes calculated from living embryos 
and multiphoton cross-sectional areas with and 
without the lumen (t tests: P = 0.496, 0.401, and
0.493, respectively).

To test whether the ectopic somites can give 
rise to normal somite derivatives, we replaced in- 
dividual recently formed somites in 10 to 14 som- 
ite secondary hosts with ectopic GFP-transgenic 
somites (Fig. 1L). After 2 to 3 days (stages 19 
to 25), the grafted somite was well integrated 
(Fig. 1M) and expressed the sclerotome/vertebral 
marker Pax1 (fig. S1) (n = 6 experiments) and
the dermomyotome/muscle marker MyoD (Fig. 1,
N to P) (n = 4) in the correct positions. Some
blood vessels were also generated (fig. S1), which 
may be normal somite derivatives (13, 14) or 
cells retaining their original lateral fate. Thus, 
the structures in the “bunch of grapes” are indeed 
somites. 


Fig. 3. Ectopic somites are 
not subdivided into rostral 
and caudal halves. (A to F) 
Ectopic somites were ana- 
lyzed for expression of caudal 
(Hairy1, Meso2, LFng, Hairy2, 
and Uncx4.1) and rostral 
(EphA4) markers. Hairy1 (A), 
Meso2 (B), and EphA4 (C) are 
not expressed; LFng and Hairy2 
(D and E) are weak and uni- 
form; and Uncx4.1 is expressed 
as random patches (F). Insets 
show a magnified view of the 
graft. (G to O) As a further test 
of rostrocaudal patterning, 
embryos grafted as in Fig. 1L 
were stained for motor axons 
[neurofilament-associated protein NAP, (G to I), brown] or neural crest cells [(J to O), brown] and anti-GFP [green in (I), (L), and
(O)]. A large gap (G to I), fused roots (J to L), or multiple small ganglia (M to O) form in the
ectopic somite (arrows, asterisks). Sections (I) and (L) are coronal; (O) is transverse at the level of the graft.




To test whether somites form sequentially or 
simultaneously, we used time-lapse microscopy 
to film ectopic GFP-transgenic somite formation. 
About 6 to 14 somites form in just 2 hours (9 to 
11 hours after grafting) (fig. S2 and movies S1 
and S2). The finding that so many somites can
form almost synchronously suggests that the 
ectopic somites form independently of a clock. 
To assess the molecular clock, we examined 
embryos at different time points before ectopic 
somite formation for expression of clock genes 
Hairy1 (Fig. 2, A to D), Hairy2 (Fig. 2, E to H),
and LFng (Fig. 2, I to L) at 45-min intervals
between 3 and 7.5 hours after exposure of PS 
explants to Noggin. Although host embryos dis- 
played typical (10) strong variations in the pattern
of expression, the explants showed only subtle 
variations, not like a prepattern of the somites 
that would later form. Moreover, when examining 
many embryos for each marker at a particular 
time point (fig. S3), oscillatory expression was 
evident in the host embryo, but the explants 
(insets) show comparatively uniform expression. 
Examination of Dapper1 and -2 expression sug-
gests that Noggin-treated mesoderm can generate 
somites without passing through a presomitic- 
like state (fig. S4). These results strongly suggest 
that the ectopic somites form simultaneously and 
without cyclic expression of clock genes. 
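
As a small illustration of the sampling schedule described in this paragraph, the sketch below enumerates time points every 45 min from 3 to 7.5 hours after Noggin exposure. The variable names are illustrative and not taken from the study. On this schedule, the values 5.15 and 6.45 quoted in the Fig. 2 legend appear to correspond to 5 h 15 min and 6 h 45 min.

```python
# Sampling schedule stated in the text: every 45 min between 3 h and 7.5 h.
START_MIN = 3 * 60        # 3 hours, in minutes
END_MIN = 7 * 60 + 30     # 7.5 hours, in minutes
STEP_MIN = 45             # sampling interval, in minutes

# Format each time point as hours:minutes.
time_points = [
    f"{minutes // 60}:{minutes % 60:02d}"
    for minutes in range(START_MIN, END_MIN + 1, STEP_MIN)
]

print(time_points)
```

The schedule yields seven time points, which bounds how finely any oscillation in the explants could have been resolved.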

Each somite is normally subdivided into two 
halves, rostral and caudal, a property subsequent- 
ly required for segmentation of the peripheral 
nervous system (15). To test whether the ectopic
somites are subdivided, we examined expression 
of caudal (Hairy1, Hairy2, LFng, Uncx4.1, and 
Meso2) and rostral (EphA4) markers. None of 
them revealed subdivision of the ectopic somites. 
Hairy1 [0 of 22 embryos (0/22)], Meso2 (0/22), 
and EphA4 (0/19) were not expressed (Fig. 3, A 
to C); LFng (22/24) (Fig. 3D) and Hairy2 (8/8) 
(Fig. 3E) were expressed weakly and uniform-
ly throughout the somites; and Uncx4.1 (13/19)
was patchy (Fig. 3F). Therefore, ectopic
somites seem to lack coherent rostrocaudal iden- 
tity, because the patterns of different genes are 
inconsistent with each other. As neural crest cells 
and motor axons normally only migrate through 
the rostral half of the sclerotome (15), we used
this as an additional test of somite patterning. An 
ectopic GFP-somite was grafted instead of a nor- 
mal somite in a secondary host (Fig. 1L). At stage 
22 to 25, the patterns of motor axon growth 
(Fig. 3, G to I) and neural crest migration (Fig. 3, 
J to O) were disrupted. Abnormalities included 
an enlarged gap between motor roots (Fig. 3, G 
to I), fusion of adjacent ventral roots and dorsal 
root ganglia (Fig. 3, J to L), or several small 
ganglia formed within a grafted somite (Fig. 3, 
M to O), as if the somite contained random 
islands of permissive (non-caudal) cells exploited 
by axons and crest cells. These results suggest 
that the ectopic somites are not subdivided into 
rostral and caudal halves, consistent with the 
proposal (16) that the clock is required for this
feature of segmentation. 

During normal development, the occipital 
somites (the most cranial four or five somites) 
form almost simultaneously rather than in se- 
quence (movie S3) and lack expression of some 
rostral and/or caudal markers (17-19). Could the
ectopic somites be occipital? We examined ex- 
pression of Hox genes (20, 21) (Fig. 4, A to P):
Hoxb3 (Fig. 4, A and C) and Hoxb4 (Fig. 4, E 
and G) were both expressed (Fig. 4, B, D, F, 
and H), suggesting that they are not occipital. 
Hoxb6 and Hoxb9 were not expressed (Fig. 4,
J and N), suggesting that they are cervical (som-
ite eight or nine). The posterior PS of stage-5 
donor embryos expresses similar genes: Hoxb3 
and b4, but not b6 or b9 (Fig. 4, A, E, I, and M); 
the latter start to be expressed later (stage 7 or 
8) (Fig. 4, C, G, K, and O). We therefore tested 




whether somites made from PSs from older em- 
bryos (stage 8) express these markers. Indeed, 
they do (Fig. 4, L and P). This confirms that the 
Hox code imparting axial identity to cells is al- 
ready present in the PS (22), independently of 
the segmentation clock (6), and suggests that 
the axial identity of the ectopic somites is spe- 
cified according to which Hox genes are expressed 
in the posterior PS at the time of explantation, 
even though this region does not normally con- 
tribute to somites. Therefore, either exit of cells 
from the PS or, more likely, inhibition of BMP 
by Noggin arrests the molecular clock control- 
ling expression of Hox genes that impart axial 
identity (23). In vivo, this may happen as pre- 
somitic cells leave the BMP-expressing PS and 
lie next to the notochord, the endogenous source 
of Noggin. 


The clock-and-wavefront model requires both 
an oscillator and a wave. In zebrafish, changing 
the period of molecular oscillations affects somite 
number and size (5, 6). We show that somites can 
form without oscillations of segmentation clock 
genes; all of their properties are normal, except 
for their rostrocaudal subdivision. Moreover, waves 
and gradients are also unnecessary, because the 
spatial organization and simultaneous formation
of the ectopic somites do not seem compatible
with them. We therefore propose that the main
function of the clock is to subdivide somites into 
rostral and caudal halves and to couple this to 
somite formation. 

If clock-and-wavefront mechanisms are 
not required to control somite formation, what 
does? Our observations implicate local cell-cell 
interactions. Embryological experiments (24) 


Fig. 4. Ectopic somites have trunk identity, fixed according to the Hox genes expressed in the
donor PS. (A to P) At stage 5, the posterior PS expresses Hoxb3 (A) and b4 (E), but not b6 (I) or b9 (M).
Ectopic somites made from posterior streak explants from these stages show a similar pattern of ex-
pression (B, F, J, and N). At stages 7 and 8, the posterior streak expresses all four genes (C, G, K, and O), as
do the ectopic somites formed from it (D, H, L, and P). Arrows point to the graft region, shown magnified
in the insets.


suggest that somites are self-organizing struc- 
tures, regulated by intrinsic properties of the cells 
and packing constraints for cells undergoing 
mesenchymal-to-epithelial conversion as they 
form spheres. We tested this in computer sim- 
ulations using CompuCell-3D (25, 26), with the 
following assumptions: (i) A cell mass is exposed
to Noggin evenly and simultaneously; (ii) in re-
sponse, cells polarize and elongate; (iii) polarized
cells secrete extracellular matrix; (iv) polarized 
cells have to be exposed to extracellular space 
at both their apical and basal surfaces; (v) tight 
junctions form at the apical ends; and (vi) mis- 
placed cells rearrange their polarity and attach 
to their appropriate ends (27). This causes cells 
to become arranged in spherical masses around 
a lumen (movie S4) (9). After a transition pe- 
riod of intense cell rearrangement, the somites 
stabilize. The number of cells they contain is 
relatively invariant, and their structure is sim- 
ilar to that seen in vivo. There is no tendency to 
merge into a giant structure, nor do very small 
stable somites form. We propose that somite size 
and shape can be controlled entirely by local 
cell interactions, such as adhesion and packing 
constraints of cells transitioning between the 
mesenchyme and a polarized epithelium (28). 
Inhibition of BMP by Noggin may be a trig- 
ger for this conversion, consistent with the ab- 
normal somite formation in Noggin-null mice 
(29), and may also “freeze” molecular determi- 
nants of axial identity (Hox code). In normal 
embryos, the segmentation clock and associated 
wave are likely to play a role in regulating the 
timing of somite formation and coupling this to 
the subdivision of each somite into rostral and 
caudal subcompartments. 
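
As a rough illustration of how packing constraints alone could fix the size of an epithelial sphere, the sketch below treats each polarized cell as a wedge with a narrow apical face on the lumen and a wider basal face on the outside. This is a purely geometric toy estimate and not the CompuCell-3D model used in the study; the function name and all cell dimensions are hypothetical values chosen for illustration.

```python
import math


def sphere_from_cell_shape(height_um, apical_width_um, basal_width_um):
    """Toy estimate: wedge-shaped cells packed radially around a lumen.

    Returns (lumen_radius, outer_radius, approximate_cell_count).
    All inputs are hypothetical cell dimensions in micrometres.
    """
    if not 0 < apical_width_um < basal_width_um:
        raise ValueError("need 0 < apical width < basal width")
    # Similar triangles: apical/basal width equals lumen/outer radius.
    lumen_radius = height_um * apical_width_um / (basal_width_um - apical_width_um)
    outer_radius = lumen_radius + height_um
    # Tile the outer surface with basal faces of area ~ basal_width^2.
    cell_count = 4 * math.pi * outer_radius ** 2 / basal_width_um ** 2
    return lumen_radius, outer_radius, cell_count


# Hypothetical dimensions, not measurements from the study.
lumen, outer, n_cells = sphere_from_cell_shape(30.0, 3.0, 6.0)
print(f"lumen radius {lumen:.1f} um, outer radius {outer:.1f} um, ~{n_cells:.0f} cells")
```

In this toy picture the sphere size depends only on the shape of the individual cells, so a larger cell mass would yield more spheres rather than larger ones, in line with the qualitative behavior reported for the simulations.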
An Antifreeze Protein Folds with an
Interior Network of More Than 400 
Semi-Clathrate Waters 


Tianjun Sun, Feng-Hsu Lin, Robert L. Campbell, John S. Allingham, Peter L. Davies* 


When polypeptide chains fold into a protein, hydrophobic groups are compacted in the center 
with exclusion of water. We report the crystal structure of an alanine-rich antifreeze protein
that retains ~400 waters in its core. The putative ice-binding residues of this dimeric, four-helix
bundle protein point inwards and coordinate the interior waters into two intersecting polypentagonal 
networks. The bundle makes minimal protein contacts between helices, but is stabilized by 
anchoring to the semi-clathrate water monolayers through backbone carbonyl groups in the 
protein interior. The ordered waters extend outwards to the protein surface and likely are involved 
in ice binding. This protein fold supports both the anchored-clathrate water mechanism of 
antifreeze protein adsorption to ice and the water-expulsion mechanism of protein folding. 


A driving force for protein folding is for-
mation of a hydrophobic core (1, 2). As
aliphatic and aromatic side chains pack
together in the protein core, they release con- 
strained waters into the surrounding solvent with 
an overall gain in entropy that helps power the 
folding process. Therefore, almost all globular 
proteins reported to date have a dry protein core. 
There are two main schools of thought for how 
a protein’s hydrophobic core is formed. In the 
dewetting mechanism, waters collectively evap- 
orate from the partially formed core. This is fol- 
lowed by the spontaneous collapse of the core, 
which stabilizes the protein by reducing the 
solvent-accessible surface area of core residues 
(3). In the expulsion mechanism, an initial struc- 
tural collapse forms a near-native intermedi- 
ate with a partially solvated hydrophobic core. 
This is followed by water expulsion from the 
hydrophobic core to form the native state (4). 
In this model, waters are thought to function 
as a lubricant that helps the hydrophobic core 
find its optimally packed state (4). Here we re- 
port the crystal structure of an antifreeze pro- 
tein (AFP) with a water-rich core that offers 
an alternative view on the role of water in pro- 
tein folding.

Antifreeze proteins are a class of proteins that
adsorb to the surface of ice crystals to prevent
their growth. We determined the crystal structure
of the AFP Maxi, a large isoform of the 3-kD,
alanine-rich type I AFP from winter flounder
(Pseudopleuronectes americanus). Type I AFP is
a monomeric α helix with an 11-residue perio-
dicity (5) that binds to a pyramidal ice plane (6).
Its ice-binding residues (Thr, i+4 Ala, i+8 Ala)
are arrayed along one face of the helix (7). Ice-
binding residues are thought to organize and sta-
bilize ordered waters to merge with, and freeze
to, the quasi-liquid layer of water next to the ice
lattice (8-10). Maxi is five times as long as type I
and forms a homodimer with no stable monomer
form, but is otherwise similar in alanine richness
(65%), helicity (>95% α helix), and 11-residue
periodicity (11-13).

The crystal structure of Maxi was determined
at 1.8 Å resolution from needle-shaped crystals
(table S1) grown in arginine buffer at pH 9.6. Maxi
is a rodlike four-helix bundle with a length of
145 Å and an average diameter of 22 Å (Fig. 1A).
Both 290 Å-long helix monomers fold exactly
in the middle through 180° so that their N and C
termini are side by side. In the dimer, the two hair-
pins are aligned in an antiparallel manner with-
out overlap such that the four-helix bundle has a
twofold rotational axis of symmetry (indicated
by the central curved arrow). The two N-terminal
helices lie adjacent to each other in an antiparallel
orientation, as do the two C-terminal helices. At
the secondary-structure level, Maxi is composed
of tandem 11-residue repeats (T/IxxxAxxxAxx,
where x is any residue) that each form three helical
turns with an average of 3.7 residues per turn
(R1-3 and R1’-3’), as opposed to 3.6 residues per
turn in the classic α helix. The only departures
from this pattern are the central sections of two
seven-residue segments (two helical turns each)
and the terminal capping regions. The capping
regions comprise the N and C termini of one
monomer and the hairpin loop region of the other
(Fig. 1B). Notably, the two chains of Maxi as-
sociate with minimal protein-protein interactions,
as do the two arms of the hairpins (Fig. 1C). This
is in sharp contrast to a standard four-helix bundle
protein like “Repressor of primer” (Rop), where
the longer aliphatic and aromatic side chains form
a compact hydrophobic core with only two waters
inside (Fig. 1D). Rop was selected for comparison
because, like Maxi, it is also a dimer of hairpin α
helices with a nonoverlapping antiparallel align-
ment (14).
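
The 11-residue periodicity can be checked with simple arithmetic. The sketch below uses only figures quoted in the text (11 residues per repeat, three helical turns per repeat, and 3.6 residues per turn for a classic α helix) to compute the rotation per residue and the angular offset accumulated over one repeat. Variable names are illustrative, and exact fractions are used to avoid floating-point round-off.

```python
from fractions import Fraction

RESIDUES_PER_REPEAT = 11                      # from the text
TURNS_PER_REPEAT = 3                          # from the text
CLASSIC_RESIDUES_PER_TURN = Fraction(18, 5)   # 3.6 residues per turn


def rotation_per_residue(residues_per_turn):
    """Degrees of rotation about the helix axis per residue."""
    return Fraction(360) / residues_per_turn


maxi_residues_per_turn = Fraction(RESIDUES_PER_REPEAT, TURNS_PER_REPEAT)
maxi_rotation = rotation_per_residue(maxi_residues_per_turn)
classic_rotation = rotation_per_residue(CLASSIC_RESIDUES_PER_TURN)

# Angular offset between residue i and residue i+11, modulo a full turn.
maxi_offset = (RESIDUES_PER_REPEAT * maxi_rotation) % 360
classic_offset = (RESIDUES_PER_REPEAT * classic_rotation) % 360

print(f"Maxi repeat: {float(maxi_residues_per_turn):.2f} residues/turn, "
      f"{float(maxi_rotation):.1f} deg/residue, offset {float(maxi_offset):.0f} deg")
print(f"Classic helix: {float(classic_rotation):.1f} deg/residue, "
      f"offset {float(classic_offset):.0f} deg")
```

With 11 residues spanning exactly three turns, residues i and i+11 fall on the same face of the helix, whereas a classic 3.6-residue helix would drift by 20° per repeat.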

Direct contact between the Maxi monomers is 
largely limited to the capping structures and the 
center region. In the capping structures, Leu” 
and Tyr!” near the hairpin loop of one monomer, 
together with Ile” at the N terminus and Phe!”! at 
the C terminus of the other monomer, pack to- 
gether to form a local hydrophobic core with 
stacking of the two aromatic side chains (Fig. 
1B). On the hairpin loop, Lys!” donates hydro- 
gen bonds to cap the backbone carbonyl groups 
of Ala!?!, Ala!**, and Ala!™ on the C terminus, 
whereas the carbonyl groups of Leu” and Ile”
and the side chain of Asn”® accept hydrogen 
bonds from Arg’. In the center region, Ile?! and 
Ile** from both N-terminal helical arms of Maxi 
form a hydrophobic cluster, as does Val'** and 
Val'“® from both C-terminal helical arms. How- 
ever, the intervening helical repeats pack loosely 
through sparse van der Waals interactions, as shown 
by a representative cross section through one repeat 
(Fig. 1C). 

The apparent loose packing of the four helices 
in the 11-residue repeat regions generates inter- 
nal space that is just wide enough to accommo- 
date a single layer of water (Fig. 1C and Fig. 2, 
A and B). The water molecules that occupy the 
gap form an extensive polypentagonal network 
around the inner-projecting residues (mostly Ala
and Thr, fig. S1), with an occasional tetragonal
and hexagonal water ring. In the 11-residue repeat 
regions, the network comprises two monolayers 
of polypentagonal waters (Fig. 2, A and B) that 
cross each other at ~90° (Fig. 3 and fig. S3). The 
intrachain sheet, which lies between the N-terminal 
arms and the C-terminal arms of Maxi, contains 


19 pentagonal rings per 11-residue repeat (labeled 
A to R in Fig. 2A). The pattern for each repeat 
is almost identical, and three repeats spanning 
nine helical turns are shown. Occasional large 
holes in the sheet are generated where side chains 
like Thr'** and Ala®* from the same monomer 
form close protein-protein contacts. The inter- 


Fig. 1. Four-helix bundle structure of Maxi has an 
open core. (A) Four-helix bundle structure of Maxi in 
ribbon format, showing how the antiparallel hairpin 
monomers 1 and 2 align. N and C termini are indicated. 
The dyad axis of symmetry is shown by the arrow C2.
Capping regions, 11-residue repeats, and center sections 
are labeled as Cap, R, and Center, respectively. Residues 
in van der Waals contact in the cap and central regions 
are indicated by their side chains. (B) End-on view of the 
capping structure of Maxi; hydrogen bonds are illustrated 
by black dotted lines. (C) End-on view of a cross section of 
Maxi in surface representation marked in (A) by the red 
box. Red spheres are crystallographic waters present in the 
20 Å-deep section. (D) Similar-sized section through Rop,
a representative four-helix bundle structure. 


Fig. 2. The interior water structure associated with the 
repeat regions of Maxi. (A) Sheet of pentagonal intrachain 
water rings labeled A to R. Waters are shown as red spheres
and hydrogen bonds by black dashes. Residues forming intra- 
chain contacts are identified and shown in stick representa- 
tion. Waters that hydrogen bond to the carbonyl groups of 
Maxi in R3 are numbered. (B) A sheet of interchain pentagonal 
waters at right angles to the sheet in (A), as indicated in the 
schematic diagrams on the right. Notations are the same as in 
(A). The pentagonal rings in this sheet are labeled from S to Z. 
In addition, rings G, H, and K at the intersection of two sheets 
are also labeled. 




chain water web that forms the interface between 
the two monomers contains seven pentagonal rings 
(labeled S to Z in Fig. 2B). Again, a few large 
holes appear in this sheet due to rare interchain 
side-chain contacts like those between Thr!” and 
Ala’? from different monomers. A molecular 
dynamics simulation of Maxi in a box of water
showed that the predicted interior water densities
match extremely well with the ones in the protein 
core of the crystal structure (fig. S2). 

Fifty years ago, Scheraga et al. theorized that 
a partial cage of pentagonal water rings will form 
when hydrophobic side chains in α helices or β
sheets are separated by a single layer of water
molecules (15). The formation of pentagonal rings
has negative free energy when two hydrated ali- 
phatic side chains approach each other during 
protein folding to a distance where they are sep- 
arated by a single water layer. This proposal has 
been borne out in the present work, where the 
spaces inside Maxi’s four-helix bundle are just 
wide enough to allow a single layer of waters to 
fit in and form a semi-clathrate structure. Before 
this study, five pentagonal “ice-like” water clusters 


Fig. 3. Interior water network in Maxi. 
Section through Maxi showing network 
of interior waters as spheres colored by 
association with the yellow or orange 
monomers. Hydrogen bonds are indi- 
cated by black dotted lines. Waters that 
make multiple matches to the ice lattice 
and are not sterically hindered by the 
protein are boxed in red.


Fig. 4. Water interactions 
with Maxi. (A) Cross section 
through R2, illustrating the 
interior water network and 
the intersection of the intra- 
and intermolecular water sheets. 
Waters are shown as red spheres, 
and hydrogen bonds are shown 
as green dotted lines. (B) A 
section of helix from Maxi in 
stick format showing bifurca- 
tion of intramolecular hydro- 
gen bonds. Hydrogen bonds 
between water (red spheres) 
and carbonyl O atoms are shown 
as green dotted lines; intrahelical 
hydrogen bonds are shown 
as black dashed lines. N atoms 
are blue and O atoms, red. (C) 
A section of helix from Rop in 
stick format showing normal 
intramolecular hydrogen bonds. 
(D) An example of semi-clathrate 
hydrate structure in the inter- 
nal space is shown as a close- 
up view of the green boxed region in (A).

had been reported in crystals of crambin [Protein
Data Bank (PDB) code: 1CRN], a 46-residue seed
storage protein, but only as a result of intermo-
lecular packing (16). It was predicted that these
localized pentagonal clusters might occur not
only at intermolecular hydrophobic contacts,
but also around adjacent, hydrophobic side chains
in α helices or β sheets. This prediction has been
supported by a molecular dynamics simulation
study of streptavidin (17), where a five-membered
water ring was formed in between two hydro- 
phobic groups in the binding cavity. The crystal 
structure of Maxi has revealed the natural asso- 
ciation of pentagonal water clusters within a pro- 
tein on a large scale. 

Two-dimensional phases of water are produced 
by nanoscale confinement between nonpolar ma-
terials (nanopores) (18). These thin water layers
exhibit unusual properties compared to bulk wa- 
ter and are typically studied by molecular simu- 
lation. Maxi contains a monolayer of amorphous 
ice/water between mainly hydrophobic surfaces. 
The pattern of the water sheet in Maxi is different
from those observed by molecular simulations 
and experimental approaches (19, 20). This is 
mainly due to the inner-surface architecture of 
the protein. Because most nanopores have flat 
surfaces, water molecules inside tend to arrange 
themselves in layers parallel to the surface (21).
By contrast, the inward-pointing side chains that 
form the inner surface of Maxi are molecularly 
rough, so the five-membered water rings form 
cages on individual residues, illustrating their 
semi-clathrate hydrate structure (Fig. 4D). In ad- 
dition, unlike most nanopores used for simula- 
tions, the inner surface of Maxi interacts with 
about one in four of the waters through hydrogen 
bonds, in addition to van der Waals interactions 
(as described below). 

The 3.7-residue repeat of the helices in Maxi 
results in altered Φ and Ψ angles compared to
those of classic α helices. This deviation likely
causes the backbone carbonyl groups to be slight-
ly tilted outwards (13), allowing them to form the
key intrahelical hydrogen bonds while also hydro-
gen bonding with solvent waters (Fig. 4B), a
duality not seen in Rop and other typical α helices
(Fig. 4C). In addition, Maxi is an alanine-rich pro- 
tein, and these small side chains make the backbone 
carbonyl groups more exposed to solvent. There- 
fore, most of the carbonyl groups in Maxi are 
involved in hydrogen-bonding interactions with 
waters, which helps to keep this rather hydro- 
phobic protein highly solvated and freely soluble 
in flounder blood. For typical protein structures 
determined at 2.0 Å resolution, the number of
water molecules located by crystallography is 
roughly equal to the number of amino acid resi- 
dues (22). By contrast, in the asymmetric unit of 
Maxi, the ratio of water molecules to amino acid 
residues is almost three times this value (2.9), 
even though Maxi does not contain many polar 
residues to hydrogen bond with waters. Because 
most of the residues in the internal space are also 
alanine, hydration occurs around all four helices. 
Maxi uses the interior bifurcated carbonyl groups 
to help anchor the more than 400 internal ordered 
waters, ~25% of which are directly involved in 
backbone hydrogen-bonding interactions. The 
hydrogen-bonding interactions in R3 are listed 
in table S2. 
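
To put the 3.7-residue repeat in geometric terms, the short Python sketch below compares how far successive residues rotate around the helix axis for a classic α helix (3.6 residues per turn) and for a helix in which 11 residues are assumed to span exactly three turns (11/3 ≈ 3.67 residues per turn). The three-turn assumption, the constant names, and the function name are illustrative only and are not taken from the structure itself.

```python
from fractions import Fraction

def angular_offset(k, residues_per_turn):
    """Rotation about the helix axis, in degrees within [0, 360),
    between residue i and residue i+k."""
    per_residue = Fraction(360) / Fraction(residues_per_turn)
    return float((per_residue * k) % 360)

CLASSIC = Fraction(18, 5)    # 3.6 residues per turn (classic alpha helix)
ELEVEN_3 = Fraction(11, 3)   # assumption: 11 residues span exactly 3 turns

# Compare offsets for residues 4, 8, and 11 positions apart.
for k in (4, 8, 11):
    print(k,
          round(angular_offset(k, CLASSIC), 1),
          round(angular_offset(k, ELEVEN_3), 1))
```

Under this assumption, residue i+11 sits directly above residue i (offset 0°), so a side chain repeated every 11 residues stays on one face of the helix, whereas in a 3.6-residue helix the same position would drift by 20° with each repeat.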

The hydrated protein core of Maxi suggests 
that this protein uses the water-expulsion fold- 
ing mechanism but does not complete the water- 
discharge step. The alanine richness of the protein 
core may have provided an ideal opportunity 
to see retention of waters during the folding. 
These waters serve to glue the four helical arms 
of Maxi together through a stabilizing network 
of hydrogen bonds that are anchored to backbone
carbonyl groups in the interior (23). Because
water-mediated protein association is less
stable than direct protein association (4), this could
explain why this antifreeze protein is thermola- 
bile and irreversibly denatures at temperatures 
above 16°C. 

In addition to the internal waters, a second 
unexpected feature of Maxi’s structure was that 
the putative ice-binding residues (Thr/Ile, i+4 
Ala, i+8 Ala) occur on the inward-pointing sur- 
faces of all four helices (colored pink in fig. S1). 
These residues were identified by homology to 
the small, monomeric type I AFP isoform that 
binds ice through this highly conserved hydrophobic
face using regularly spaced methyl groups
from the Ala and Thr/Ile (7). The same ice-binding 
residues are well conserved in each 11-residue 
repeat of Maxi and occupy equivalent quadrants 
of the helices throughout the molecule. Their side 
chains have a function similar to that of ice-binding 
residues because they cooperate to form and anchor 
the interior ordered waters that stabilize the protein 
structure (Fig. 4A and table S2). The average 
B-factor for all waters in the crystallographic 
asymmetric unit is 26 Å². However, most of the
waters in the inner network have B-factors that
are lower than 20 Å² (colored blue and green in
fig. S4). Outer waters (colored red) are more 
disordered. 
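
As a rough illustration of how such B-factor statistics could be tabulated, the Python sketch below reads the temperature-factor field of water oxygens (residue name HOH) from PDB-format coordinate records and reports their mean and the fraction below a cutoff. The records are synthetic, produced by a hypothetical helper, and do not come from the Maxi coordinates; all function names are assumptions made for illustration.

```python
from statistics import mean

def water_b_factors(pdb_lines):
    """Return B-factors (in Å^2) of water atoms from PDB-format lines.

    Uses the fixed-column layout: residue name in columns 18-20,
    temperature factor in columns 61-66 (1-indexed).
    """
    values = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and line[17:20] == "HOH":
            values.append(float(line[60:66]))
    return values

def synthetic_water(serial, b_factor):
    """Hypothetical helper: one HETATM record for a water oxygen
    with made-up coordinates at the origin and full occupancy."""
    return (f"HETATM{serial:5d}  O   HOH A{serial:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}")

def summarize(b_values, cutoff=20.0):
    """Mean B-factor and fraction of waters with B below the cutoff."""
    below = sum(1 for b in b_values if b < cutoff)
    return mean(b_values), below / len(b_values)

# Synthetic input: one non-coordinate line plus three invented waters.
records = ["REMARK synthetic example"] + [
    synthetic_water(n, b) for n, b in enumerate([15.0, 18.5, 44.5], start=1)
]
avg, frac = summarize(water_b_factors(records))
print(f"mean B = {avg:.1f} Å^2, fraction below 20 Å^2 = {frac:.2f}")
```

Applied to a real coordinate file, the same two numbers could be computed separately for inner and outer waters once a rule for assigning each water to a group has been chosen; that rule is not specified here.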

Three lines of evidence suggest that the crystal
structure of Maxi is representative of the solution
form and that the protein does not undergo
a conformational change to bind to ice crystals
through the Thr/Ile, i+4 Ala, i+8 Ala residues.
First, Olijve et al. have determined the solution 
structure of Maxi at low temperature using small- 
angle x-ray scattering (24). In solution, Maxi has 
a cylindrical shape and dimensions that are con- 
sistent with the four-helix bundle crystal structure 
and eliminate a previously proposed model where 
the two helices form a fully extended helix dimer 
(13). Second, the measurement of intrinsic flu- 
orescence transfer between Tyr and Phe sug- 
gests that the exquisite capping structure seen in 
the crystal (Fig. 1B) is also present in solution 
(fig. S3). Lastly, cross-linking experiments show 
that the juxtaposition of residues on neighboring
chains of the helix bundle is the same in solution
as it is in the crystal (figs. S4 and S5). Moreover,
where these cross-links prevent the opening of the 
helix bundle to expose the “ice-binding residues,”
there is no appreciable loss of antifreeze activity 
(table S3). 

How then does Maxi bind to ice? Close ex- 
amination of Maxi’s crystal structure shows po- 
sitioned waters extending outwards between all 
four helices from the core to the surface. At the 
periphery, they form a network of ordered waters 
that are unobstructed by the helices and available 
to merge and freeze with the quasi-liquid layer on 
the surface of ice (8–10). It is possible to fit
clusters of crystallographic surface waters at the 
face formed by the N- and C-terminal helices 
(boxed regions in Fig. 3) into the ice lattice on 
numerous planes, three of which are shown in 
fig. S6, A to C. Consistent with this result, when 
we used fluorescence-based ice plane affinity anal- 
ysis (25) to determine which ice planes adsorb 
Maxi, all surfaces of the single ice-crystal hemi- 
sphere were bound by the fluorescently tagged 
AFP (fig. S7D). Unlike other AFPs character- 
ized to date, which have ice-binding residues 
located on their surface, Maxi is the only one in 
which these residues are buried inside. This further
supports the anchored-clathrate water mechanism
by which AFPs adsorb to ice (10) because
it suggests that this AFP cannot directly bind to 
its ligand but must do so through the ordered 
surface waters. 
The Sequencing Continuum for Clinical Research:
From Sanger to Next Gen


Researchers today have many choices when deciding which sequencing
technology to use for their clinical research. Sanger sequencing, long
the “gold standard” in clinical research sequencing technology, has met
significant competition with the advent of next generation sequencing
(NGS). NGS technologies are now commonplace in clinical research
laboratories where they have enabled rapid advances in the gathering
and analysis of genetic information. However, with these advances have
come additional challenges involving validation as these technologies
become more widespread and move closer to future clinical application.
In this webinar, our expert speakers will discuss the relative benefits of
Sanger and NGS technologies and their application in different fields of
clinical research.

Speakers

Miguel E. Quiñones-Mateu, Ph.D.
UHCMC/CWRU
Cleveland, OH

Second speaker to be announced.

During the webinar, the speakers will: 


* Compare and contrast Sanger sequencing and NGS approaches
* Describe the most relevant applications for both Sanger
sequencing and NGS
* Discuss future clinical research applications of both Sanger
sequencing and NGS
* Answer your questions live!


Brought to you by the Science/AAAS Custom
Publishing Office


Webinar sponsored by Ion Torrent, by Life Technologies


DEEP-IMAGING MULTIPHOTON SYSTEM


The new FluoView FVMPE-RS is a dedicated multiphoton microscope system, which enables 
high-precision, ultrafast scanning and stimulation. This system allows researchers to see deep 
within specimens, take measurements at the highest speeds, and capture images, even when 
working under the most demanding conditions. With its high-speed and high-precision perfor- 
mance, the FVMPE-RS is designed for electrophysiology and optogenetics studies. It is also a 
good match for applications such as high-speed calcium and in vivo imaging, peristalsis and blood 
flow studies, mosaic imaging, connectomics and functional brain imaging, stem cell research, 
and any field that requires precise co-localization, uncaging, simultaneous imaging/stimulation, 
extensive real-time signal processing, or multipoint mapping. Its design offers ready adaptability 
for researchers who design their own custom-built systems as well. The precision timing on the
system allows for microsecond repeatability and control of multiple imaging and stimulation
protocols as well as millisecond repeatability over days of time-lapse imaging. 


Olympus 
For info: +49-40-23773-5913 | www.microscopy.olympus.eu 


SEQUENCING PANEL 

The TruSight One Sequencing Panel is the industry’s broadest sequencing 
panel, targeting 4,813 genes with known associated clinical phenotypes. 
Clinical research laboratories can use this panel to expand existing menus, 
streamline workflows, or create an entire portfolio of sequencing options, 
with benefits including increased productivity, reduced handling errors, 
and decreased costs. The power of the TruSight One Sequencing Panel is 
enhanced by Illumina’s user-friendly VariantStudio analysis and reporting 
software. To coincide with the introduction of the TruSight One Sequencing 
Panel, VariantStudio will offer new features that expand annotation and fil- 
tering capabilities. These features include support for enabling family-based 
filtering (mother, father, child, and siblings), providing variant classifica- 
tions, and generating ready-to-use reports. In addition, the TruSight One 
Sequencing Panel can be used to create dozens of “virtual subpanels” to fit 
the needs of any clinical researcher seeking to understand the genetic basis 
of disease. 

Illumina

For info: 800-809-4566 | www.illumina.com/powerofone 


DNA METHYLATION ANALYSIS KITS 

New, high-efficiency BioArray conversion kits for detection of 5 mC and 5 
hmC modifications (Express DNA Methylation Kit, Blood & Tissue DNA 
Methylation Kit, and BioArray 5-hmC Methylation Kit) allow for resolu- 
tion of single-nucleotide changes without the introduction of DNA dam- 
age. Unique features such as capacity for multiple sample inputs, rapid pro- 
cessing time, and high conversion efficiency rate, provide enhanced flexibil- 
ity without sacrificing performance. For researchers requiring downstream 
analysis products, new enzyme-linked immunosorbent assay (ELISA)-based 
detection kits for 5 mC and 5 hmC exhibit high specificity comparable to 
mass spectrometry analysis and are amenable to high throughput process- 
ing. Complementing the ELISA kits is a new real-time polymerase chain re- 
action-based detection platform, the BioPanel DNA Methylation Detection 
Kit for Human Pluripotent Stem Cells. This product is designed to quantify 
the percentage of DNA methylation in six gene promoters: RAB25, NANOG,
PTPN6, MGMT, GBP3, and LYST.

Enzo Life Sciences 

For info: 800-942-0430 | www.enzolifesciences.com 


ION CHANNEL DRUG SCREENING 

The new SyncroPatch 384 Patch Engine (PE) propels ion channel drug 
discovery to a new level. Designed for seamless integration into process- 
automated drug screening environments, the Patch Engine is equipped with 
384 patch clamp amplifiers and an advanced 384 channel liquid handling 
robot. The SyncroPatch 384 PE is the high-quality, automated patch clamp 
system able to finally thrust gold standard electrophysiology from second- 
ary to primary ion channel drug screening. Allowing for up to 20,000 data 
points per day, it is the most efficient platform on the market for high qual- 
ity, ion channel recordings. This efficiency is primarily due to fully parallel 
measurements from 384 cells, the 384-channel pipettor, and exceptionally 
efficient control and analysis software. Both hardware and software have 
been fully tested and validated with leading players in industrial ion channel 
drug development to provide optimum performance in true high through- 
put ion channel screening. 

Nanion Technologies 

For info: +49-89-2189-97972 | www.nanion.de 


ddPCR LIBRARY QUANTIFICATION KIT 

The new droplet digital polymerase chain reaction (ddPCR) library quan- 
tification kit is designed for Ion Torrent library preparation. Used with Bio- 
Rad’s QX200 Droplet Digital PCR system, the new kit provides research- 
ers with the ability to precisely and directly measure amplifiable library 
concentrations. The Ion AmpliSeq library kit is used to prepare libraries 
for Ion Torrent next generation sequencing systems. Using the ddPCR li- 
brary quantification kit to quantify Ion AmpliSeq gDNA and RNA libraries 
maximizes the number of useable reads, enables consistent loading, and 
optimizes the utilization of every sequencing run. The resulting data pro- 
vide additional measures of library quality not provided by other meth- 
ods, including the percentage of nonamplifiable species such as adapter 
dimers and the size range of library inserts. Additional key benefits of the 
ddPCR library quantification kit for Ion Torrent systems include superior 
performance, visualization of library quality, and efficient utilization of 
sequencing runs. 

Bio-Rad Laboratories 


For info: 800-424-6723 | www.bio-rad.com/ion-torrent 


Electronically submit your new product description or product literature information! Go to www.sciencemag.org/products/newproducts.dtl for more information. 


Newly offered instrumentation, apparatus, and laboratory materials of interest to researchers in all disciplines in academic, industrial, and governmental organiza- 
tions are featured in this space. Emphasis is given to purpose, chief characteristics, and availability of products and materials. Endorsement by Science or AAAS of 
any products or materials mentioned is not implied. Additional information may be obtained from the manufacturer or supplier. 


www.sciencemag.org/products 
AAAS 2015
ANNUAL MEETING


12-16 FEBRUARY • SAN JOSE, CA


TEAM SAN JOSE 


Call for Symposium Proposals 


Symposium proposals for the 2015 AAAS Annual Meeting are now being solicited. To submit 
a proposal, visit www.aaas.org/meetings. The deadline for submission is 25 April 2014. 


Innovations in Information and Imaging 


Science and technology are being transformed by new ways to collect 
and use information. Progress in all fields is increasingly driven by the 
ability to organize, visualize, and analyze data. Advances in information 
and imaging technologies are generating novel applications in fields 
such as biochemistry, computer science, particle physics, genomics, and 
oceanography, and creating ways to interpret data across disciplines. 
This transformation makes scientific information more open, available, 
and accessible globally. The escalating amount of data, and advances 
in data analysis, are changing the ways we discover answers to scientific 
and societal problems. Thoughtful consideration of how information
is used for societal benefit, evaluated for potential risks, and
communicated beyond the scientific community will allow this
revolution to reach its full potential.


AAAS


ADVANCING SCIENCE, SERVING SOCIETY 


CellTox™ Green 
More Biology, Less Work 


CellTox™ Green enables real-time mechanistic toxicity monitoring with a simple
Add & Read protocol. Multiplexing with CellTiter-Glo® allows investigators
to monitor temporal changes of membrane-modulated cytotoxicity in parallel
with the key cell viability biomarker, ATP.

Get more informative data from same-well
multiplexing of cytotoxicity and viability assays. 


To see how easy better biology can be, request a free sample at: 
www.promega.com/realtimecytotoxicity 


©2014 Promega Corporation. All rights reserved. 22410-5471 


Science Careers 
Advertising 


For full advertising details, go to 
ScienceCareers.org and click 
For Employers, or call one of 
our representatives. 


Tracy Holmes 

Worldwide Associate Director 
Science Careers 

Phone: +44 (0) 1223 326525 


E-mail: advertise@sciencecareers.org 
Fax: 202-289-6742 


Tina Burks 
Phone: 202-326-6577 


Marci Gallun 
Phone: 202-326-6582 


Online Job Posting Questions 
Phone: 202-312-6375 


E-mail: ads@science-int.co.uk 
Fax: +44 (0) 1223 326532 


Axel Gesatzki 
Phone: +44 (0)1223 326529 


Sarah Lelarge 
Phone: +44 (0) 1223 326527 


Kelly Grace 
Phone: +44 (0) 1223 326528 


Yuri Kobayashi 
Phone: +81-(0)90-9110-1719 
E-mail: ykobayas@aaas.org 


Ruolei Wu 
Phone: +86-1367-1015-294 
E-mail: rwu@aaas.org 


All ads submitted for publication must comply 
with applicable U.S. and non-U.S. laws. Science 
reserves the right to refuse any advertisement 
at its sole discretion for any reason, including 
without limitation for offensive language or 
inappropriate content, and all advertising is 
subject to publisher approval. Science encour- 
ages our readers to alert us to any ads that 
they feel may be discriminatory or offensive. 
2014
Science/UCSF
Career Event


March 21, 2014 • San Francisco, CA
UCSF Mission Bay Community Center • 10-4:45


Science Careers and UCSF have joined forces to deliver 
an exciting event that includes five career workshops 
and a chance to meet face to face with recruiters. Two 
of the five workshops will focus on career opportunities 
in Asia. 


Job seekers: Visit the Mission Bay campus for a 
chance to get valuable advice from career experts 
and to meet with recruiters from some of the top 
scientific organizations. The combination of valuable 
career development content and exciting career 
opportunities makes this a “must-attend” event for 
scientists in the bay area. For more details and to 
register, visit 


Employers: Save time and money by meeting 
hundreds of scientists in person. If your organization 
would like to recruit at this event, please call 
202 326-6577 for more information or e-mail 


From the journal Science
For careers in science, there’s only one


Research Positions


Applied Science, Technology,
and Engineering Research


Idaho National Laboratory


INL’s Energy and Environment Science and Technology Directorate is seeking 
outstanding, highly creative and motivated early to mid-level career professionals
to join our multi-disciplinary research teams. The Directorate is INL’s
principal multi-mission organization focused on research to advance clean 
energy systems, advanced transportation, advanced process technology and 
related sciences. 


Multiple positions are available in the following areas: 

* Bioenergy research including biomass characterization, conversion,
preprocessing and molecular biology.
* Analytical chemistry specializing in laser spectroscopy and mass spectrometry.
* Materials science and physics with focus on performance in harsh environments,
nondestructive evaluation, and radiation-based imaging.
* Membrane science with a focus on dense film separation and filtration.
* Geology and hydrology with a focus on hydraulic fracturing and
geothermal evaluation.
* Scientific visualization research in a CAVE environment with a focus on
immersive visualization, virtual reality, graphics programming, and large
format display technologies.


The Idaho National Laboratory is a science-based, applied engineering national 
laboratory dedicated to supporting the U.S. Department of Energy’s mission 
in nuclear energy research, science, and national defense. With 3,800 scien- 
tists, researchers and support staff, the laboratory works with national and 
international governments, universities and industry partners to discover new 
science, develop technologies that underpin the nation’s nuclear and renew- 
able energy, national security and environmental missions. The Laboratory is 
a multi-program national laboratory. It currently performs a range of research 
and development activities associated with energy and national security. The 
laboratory currently has more than 150,000 sq. ft. of modern research facilities 
with a significant amount of wet laboratory space that are supported by a vast 
array of state-of-the-art research instrumentation. Idaho Falls is conveniently 
situated near many national treasures such as Yellowstone National Park, Teton 
National Park, Jackson, WY, etc. For more information about the area, please 
visit www.visitidahofalls.com and www.visitidaho.org. Interested parties
should visit our website at www.inl.gov. 


INL is an Equal Opportunity Employer M/F/D/V 


VIRGINIA COMMONWEALTH UNIVERSITY


Seeking Applications for Multiple 
Faculty Positions 
School of Engineering 


The School of Engineering at Virginia Commonwealth University in 
Richmond, Virginia, is seeking qualified candidates for tenure-track 
faculty positions at the Assistant or Associate Professor level for Fall 
2014. Founded in 1996, the School stands as a remarkable example of 
public-private partnership. Due to our expansion plans, we are seeking 
faculty in all five departments — biomedical engineering, chemical and 
life science engineering, computer science, electrical and computer 
engineering, and mechanical and nuclear engineering. 


Our School is seeking candidates with research experience and 
publications that make a positive difference in our community and humankind.
The School’s strategic plan places particular emphasis on the growth
of basic and applied research related to six areas of multi-disciplinary 
research including: sustainability and energy engineering, micro and nano 
electronic systems, pharmaceutical engineering, mechanobiology and 
regenerative medicine, security and mining of big data, and device design 
and development. Collaborative research and scholarly activity with the 
medical campus faculty and its associated Schools will be required, as 
well as engagement with the VCU Hospitals (as appropriate). 


The School is projecting an undergraduate enrollment of 2000 and a 
graduate student enrollment of 500 with a faculty of 115 by 2020. 
Candidates must have demonstrated experience working in and fostering 
a diverse faculty, staff, and student environment or commitment to do so 
as a faculty member at VCU. 


For further information concerning these jobs, or to apply, please go to: 
http://www.pubinfo.vcu.edu/facjobs/searchunitNew.asp?Item=Engineering
Virginia Commonwealth University is an Equal Opportunity/Affirmative 
Action Employer. Women, minorities and persons with disabilities are 
encouraged to apply. 


Technical University of Denmark DTU 


SCIENCE AND TECHNOLOGY AT A GLOBAL SCALE 
- SET THE STANDARDS FOR THE FUTURE 
See our PhD-programmes at dtu.dk/phd 


online @sciencecareers.org 


Science Careers 


University of Colorado 
Anschutz Medical Campus 


Assistant/Associate Professor 
Department of Biochemistry and Molecular Genetics 


The Department of Biochemistry and Molecular Genetics at the Univer- 
sity of Colorado School of Medicine invites applications for faculty in 
the area of structural biology, with a focus on the use of single particle 
cryo-electron microscopy or mass spectrometry-based proteomics. 
Ideal candidates will integrate these methods with a variety of structural 
biology and biochemical techniques to address fundamental questions in 
molecular biology. We are seeking an entry level (Assistant Professor) 
or mid-career (Associate Professor) colleague. 


Successful candidates will be expected to establish an innovative, inde- 
pendent and collaborative research program and participate in teaching. 
They will join a highly interactive, interdisciplinary group of faculty, 
students, and fellows, and enjoy access to state-of-the-art equipment and 
facilities on our new campus. Candidates must hold a PhD (or equivalent) 
degree and have a strong record of research accomplishment. 


Applications are accepted electronically at www.jobsatcu.com, refer to 
job posting F01090. Applicants should submit a current CV, a cover letter, 
a statement of Research Accomplishments and plans (2 page maximum), 
and the contact information of at least 3 professional references. 


We will begin reviewing applications March 15, 2014. 


The University of Colorado strongly supports the principle of diversity. 
We encourage applications from women, ethnic minorities, persons 
with disabilities and all veterans. The University of Colorado is 
committed to diversity and equality in education and employment. 


UNIVERSITY OF 
CAMBRIDGE www.jobs.cam.ac.uk 


Research Associate 


Department of Pharmacology • £28,132-£36,661 (depending on experience)


Multidrug transporters mediate the active extrusion of cytotoxic agents away from 
their intracellular targets. They are pharmacologically important proteins in microbial 
pathogens that are associated with some of the most devastating diseases in the world; 
in this capacity they can impair antimicrobial chemotherapy. 


A post-doctoral position is available for up to 2.5 years in the Department of Pharmacology, 
University of Cambridge, under supervision of Dr Hendrik van Veen. Van Veen’s research 
group aims at the molecular mechanisms of recognition and transport of chemotherapeutic 
drugs by efflux pumps from pathogenic bacteria and cancer cells. 


The project will focus on the molecular mechanisms of tripartite drug efflux pumps, 
such as AcrAB-TolC and MacAB-TolC in Escherichia coli, and will involve biochemical
and electrophysiological techniques in cells and artificial membrane vesicle systems 
to study substrate transport and its energetics. The project is funded by a programme 
grant of the Human Frontier Science Program Organization (HFSPO). 


Applicants should be highly motivated, enthusiastic individuals, capable of thinking 
and working independently, and must have experience with membrane proteins. 
Candidates should have or shortly expect to obtain a PhD in a related subject. 

Limited funding: The funds for this post are available for 2.5 years in the first instance. 


Further information can be found on our website at: 
http://www.phar.cam.ac.uk/research/vanveen 

To apply online for this vacancy and to view further information about the role, 
please visit: http://www.jobs.cam.ac.uk/job/3081 

This will take you to the role on the University’s Job Opportunities pages. There 
you will need to click on the ‘Apply online’ button and register an account with the 
University’s Web Recruitment System (if you have not already) and log in before 
completing the online application form. 

Applications, to include a CV, cover sheet (CHRIS/6, Parts I and III only) including the
names and addresses of two referees, a brief statement (two sides maximum) of future 
research plans and a list of publications, should be submitted to the Departmental 
Administrator, Ms Jessica Dunne, Department of Pharmacology, Tennis Court Road, 
Cambridge, CB2 1PD, Tel: (01223) 334002, e-mail: recruitment@phar.cam.ac.uk 

Please quote reference PLO2632 on your application and in any correspondence 
about this vacancy. 

Closing date: Friday 7 March 2014 

The University values diversity and is committed to equality of opportunity. 


The University has a responsibility to ensure that all employees are eligible to live and work in the UK. 


generation next


MedImmune is committed to investing in the talented, scientific minds
of the next generation. We're looking for motivated and innovative 
post-doctoral scientists who bring a passion for great ideas, fresh 
thinking to scientific challenges, and a desire to make a difference. 


Visit MedImmune.com/careers 
Download your free copy today at


Prescott Medical Communications Group


Medical Writer 


The Prescott Medical Communications 
Group, a health care communications 
company serving the pharmaceutical and 
biotechnology industries, has immediate 
openings for scientific writers/editors. Can- 
didates must possess an advanced biomedical 
science degree (MS, PhD, PharmD, MD) and 
excellent oral/written communications skills. 
Previous experience in CME or agency setting 
is desirable, as is a background in neurosci- 
ence and/or cardiopulmonary medicine. The 
available positions are full-time in-house, 
and will require residing in the Chicago area 
with occasional domestic/international travel. 
PMCG offers an unparalleled opportunity for
professional development in a fast-paced and 
intellectually challenging environment. Our 
healthcare package is the best: BCBS Health- 
care Savings Accounts, or PPO, in addition to 
a host of top-of-the-line benefits. 


Please send employment history, salary 
requirements, and three writing samples 
(e.g., manuscript, slide deck, and scientific 
poster) to: 
Jim Bachleda 
Prescott Medical Communications Group 
205 N. Michigan Avenue, Suite 3400 
Chicago IL 60601 
Fax: 312.528.3901 
Email: jbachleda@prescottmed.com 


Max Planck Institute of Immunobiology and Epigenetics


Max-Planck-Institut für Immunbiologie und Epigenetik


We are offering 


Postdoctoral Positions in Immunology 


The Max Planck Institute of Immunobiology and Epigenetics in Freiburg, Germany, is offering several 
Postdoctoral Positions in the Department of Developmental Immunology (Head: Dr. Thomas Boehm). The 
positions are available for initial two-year appointments with the possibility of extension. 


The MPI in Freiburg is an international research institute at the cross-road of Southern Germany, Switzerland 
and France. The working language is English. State-of-the-art infrastructure and service units including mouse 
and zebrafish facilities, flow cytometry, imaging, mass spectrometry and proteomics units are available. 


Research in the Department of Developmental Immunology covers a broad range of topics. We are interested in 
the functional characteristics of the alternative adaptive immune system in lampreys (see Nature 470, 90, 2011; 
Nature 501, 435, 2013), the structure of the immune systems of cartilaginous and bony fishes (see Nature 505, 
174, 2014; PNAS 110, 6043, 2013; J. Immunol. 186, 7060, 2011) and the function of the thymus (see Nature 
441, 992, 2006; Cell 138, 186, 2009; Cell 149, 159, 2012). 

Your qualifications: 

We seek enthusiastic, highly motivated and science-driven postdoctoral fellows to join our research activities 
in these areas. Applications from scientists with proven experience in bioinformatics, cellular immunology and 
mouse genetics are particularly welcome. 


Please ask three referees to send recommendation letters directly to kirk@ie-freiburg.mpg.de 


We offer: 
Salaries are in accordance with the postdoctoral fellowship guidelines of the Max Planck Society or TVöD and
commensurate with experience. 


Application deadline: 15.03.2014 


Handicapped applicants with equal qualifications will be given preferential treatment. The Max Planck Society 
seeks to increase the number of women in areas where they are underrepresented, and therefore explicitly 
encourages women to apply. A childcare facility is directly attached to the Institute. 
Max Planck Institute of Immunobiology and Epigenetics 
Ms. Stallone 
Please apply online via the Jobmarket at our website. We are looking forward to receiving your complete 
application documents. 


http://www1.ie-freiburg.mpg.de/jobs


USDA


United States Department of Agriculture 
National Institute of Food and Agriculture 


The NATIONAL INSTITUTE OF FOOD AND 
AGRICULTURE (NIFA) in the DEPARTMENT 
OF AGRICULTURE is seeking to fill the Senior 
Executive Service (SES) position of Deputy 
Director of the Institute of Food Production 
and Sustainability (IFPS). 


The Deputy Director of IFPS is responsible for 
scientific and managerial leadership and direc- 
tion in formulating and implementing policies 
and programs that support research, education, 
and extension programs leading to new science 
to sustainably boost U.S. agricultural production, 
improve global capacity to meet the growing food 
demand, and foster innovation in fighting hunger 
by addressing food security. The incumbent serves 
as a principal scientific and management advisor 
in administering, evaluating, planning, directing, 
and coordinating activities related to the mission 
and function of NIFA, in execution of policies 
and practices of grants management, and in the 
allocation of resources to carry out these policies 
and practices. More information about NIFA can 
be found at http://www.nifa.usda.gov/. 


A copy of the job announcement (AG-22-014- 
0012) is available at https://www.usajobs.gov/. 
All applications must be received by March 
12, 2014. 


U.S. CITIZENSHIP REQUIRED. USDA IS AN 
EQUAL OPPORTUNITY PROVIDER AND 
EMPLOYER. 


CARNEGIE INSTITUTION FOR SCIENCE 

PRESIDENT 


The Carnegie Institution for Science invites nominations and applications for the 
position of President. 


The Carnegie Institution for Science was founded by Andrew Carnegie in 1902 “to 
encourage, in the broadest and most liberal manner, investigation, research, and 
discovery and the application of knowledge to the improvement of mankind...” One 
of the few organizations that allows scientists the freedom to independently explore 
new directions, the Carnegie Institution has remained at the forefront of scientific 
discovery since its founding. 


The Carnegie Institution is supported by an endowment of approximately $1 billion, 
and is currently comprised of six research departments: the Observatories, in 
Pasadena, California, and Las Campanas, Chile; the Department of Terrestrial 
Magnetism and the Geophysical Laboratory, both in Washington, D.C.; the 
Department of Embryology in Baltimore, Maryland; and the Departments of Plant 
Biology and of Global Ecology in Stanford, California. 


The Carnegie Institution seeks a bold, inspirational leader to serve as its next 
President. The successful candidate will be an eminent scholar with an outstanding 
record of peer-reviewed research. She/he will have successfully led a large and 
complex organization, will have interest and experience in attracting resources for 
scientific endeavors, and will possess unquestionable integrity. The President will be 
a catalyst for great science. 


All nominations and applications will be treated with the strictest confidence. 
Interested candidates should submit a C.V. and a summary of accomplishments to: 
Carnegie Institution for Science Presidential Search Committee 
carnegie@russellreynolds.com 


The Carnegie Institution for Science is an equal opportunity / affirmative action employer. 


online @sciencecareers.org 

For your career in science, there’s only one 

INDIVIDUAL DEVELOPMENT 
[Screenshot: myIDP web application, Skills Development Goals page with a form for adding a SMART goal] 


Recommended by leading professional societies and endorsed by the National Institutes of Health, 
an individual development plan will help you prepare for a successful and satisfying scientific career. 


In collaboration with FASEB, UCSF, and the Medical College of Wisconsin and with 
support from the Burroughs Wellcome Fund, AAAS and Science Careers present 
the first and only online app that helps scientists prepare their very own 
individual development plan. 


In partnership with: 
FASEB (Federation of American Societies for Experimental Biology) 
UCSF (University of California, San Francisco) 
Burroughs Wellcome Fund 
Medical College of Wisconsin 

Visit the website and start planning today! 
myIDP.sciencecareers.org 


WOMEN IN SCIENCE 

Forging new pathways in green science 

Read inspiring stories of women working in “Green Science” who are blending 
a unique combination of enthusiasm for science and concern for others 
to make the world a better place. 

Download this free booklet: 
ScienceCareers.org/LOrealWiS 


Science 


This booklet is brought to you by the 
AAAS/Science Business Office 
in partnership with the 
L’Oreal Foundation 

Associate Editor for Science 


Join the editorial team at Science. We are seeking a full-time Associate Editor in the physical sciences 
to work, preferably, in our Washington, DC, USA, or Cambridge, UK, office. 


We are looking for a scientist with broad interests, a lively curiosity, and experience with cutting-edge 
research in at least one, but preferably more than one, of the following fields: 

* geophysics 
* seismology 
* mineral physics and mechanics 
* materials science 


Responsibilities include managing the selection, review, and editing of research manuscripts, 
working with authors on revisions, soliciting review articles and special issues, and fostering 
contacts and communication with the scientific community. Candidates are expected to travel to 
scientific meetings. 


A Ph.D. in a scientific discipline, postdoctoral experience and multiple publications are required, 
as is the ability to work constructively as a member of a team. Previous editorial experience is not 
necessary, but evidence of an aptitude and passion for the communication of science is essential. 


Science is published by the AAAS, the world’s largest general scientific membership organization. 
Visit us at www.sciencemag.org and www.aaas.org. EOE. Non-smoking work environment. 


For consideration send a cover letter and resume, along with your salary requirements, to 
AAAS 
Attention: Executive Editor (Request #1732) 
Human Resources Office, Suite 101 
1200 New York Ave., NW 
Washington, DC 20005 
or, by email, to jobs@aaas.org or, by fax, to 202-682-1630 


AAAS is here - helping scientists achieve career success. 


Every month, over 400,000 students and scientists visit ScienceCareers.org in search of 
the information, advice, and opportunities they need to take the next step in their careers. 


A complete career resource, free to the public, Science Careers offers a suite of tools 
and services developed specifically for scientists. With hundreds of career development 
articles, webinars and downloadable booklets filled with practical advice, a community 
forum providing answers to career questions, and thousands of job listings in academia, 
government, and industry, Science Careers has helped countless individuals prepare 
themselves for successful careers. 


As a AAAS member, your dues help AAAS make this service freely available to the 
scientific community. If you’re not a member, join us. Together we can make a difference. 


To learn more, visit 
aaas.org/plusyou/sciencecareers 

The USDA, Agricultural Research Service, Invasive Insect Biocontrol and Behavior Laboratory at the 
Beltsville Agricultural Research Center, Beltsville, Maryland, is seeking a 
POSTDOCTORAL RESEARCH ASSOCIATE, Research Chemist, for a two-year appointment. A Ph.D. 
is required. Salary is commensurate with experience ($63,091 to $82,019 per annum), plus benefits. 
Citizenship restrictions apply. The incumbent will work in the Crop Protection and Quarantine Program 
and will conduct research with the long-term goal of developing unique behavioral and biological pest 
management strategies to keep insect pest and other arthropod populations below economic injury levels 
using environmentally sound methods. The incumbent will conduct semiochemical identification and 
chemical synthesis to develop efficient crop protection tools for management of the brown marmorated 
stink bug. The position requires a recent degree in chemistry as well as knowledge of organic synthesis, 
chemical ecology, analytical chemistry, entomology, and biology, and skill in the use of statistical 
analysis software. For further information on Postdoctoral Research Associate jobs, refer to 
http://www.afin.ars.usda.gov/divisions/hrd/hrdhomepage/vacancy/pd962.html. For complete application 
instructions and the full text of the announcement, refer to RA13-087-H at 
https://www.usajobs.gov/GetJob/ViewDetails/359234000. Send application materials and references to 
Dr. Aijun Zhang, USDA, Agricultural Research Service, Invasive Insect Biocontrol and Behavior 
Laboratory, Bldg. 007, Rm. 312, BARC-West, 10300 Baltimore Avenue, Beltsville, MD 20705-2350. 
Telephone: 301-504-5223 (office), fax: 301-504-6580, e-mail: aijun.zhang@ars.usda.gov. 
USDA/ARS is an Equal Opportunity Employer and Provider. 

From technology specialists to patent 
attorneys to policy advisers, learn more 
about the types of careers that scientists 
can pursue and the skills needed in order 
to succeed in nonresearch careers. 